Building a translational cancer dependency map for The Cancer Genome Atlas

Predictive modeling of gene essentiality

To begin building the translational dependency maps, predictive models of gene essentiality were trained on genome-wide CRISPR-Cas9 knockout screens from the DEPMAP (ref. 8) using elastic-net regularization for feature selection and modeling (ref. 23) (Fig. 1a). Genome-wide gene essentiality scores for DEPMAP cancer cell models (n = 897) were estimated by CERES (ref. 24), which measures the essentiality of each gene relative to the distribution of effect sizes for common essential and nonessential genes within each cell line (ref. 25). Because many genes do not impact cell viability, elastic-net models were attempted only for genes with at least five dependent and nondependent cell lines, which included 7,260 out of 18,119 genes (40%) with gene essentiality scores in the DEPMAP. In addition to gene essentiality scores, the input variables for elastic-net predictive modeling included genome-wide gene expression, mutation and copy number profiles for each cancer cell model. Based on previous evidence that predictive modeling of gene essentiality with RNA expression performed comparably to similar modeling that also included DNA features (refs. 26,27), two sets of elastic-net models were compared using RNA alone (expression only) or combined with mutation and copy number profiles (multi-omics). Finally, the best-fitting elastic-net models were selected by tenfold cross-validation to identify models with the minimum error, while balancing the predictive performance with the number of features selected (Methods).

Fig. 1: Predictive modeling of gene essentiality in the DEPMAP. a, Schematic of the elastic-net models for predictive modeling of gene essentiality in the DEPMAP using expression-only data or multi-omics data. Note the broad overlap in cross-validated models using expression-only or multi-omics data. b, Distribution of the number of features per multi-omics model. c, Distribution of the number of features per expression-only model. d, Number of features per multi-omics model that passed (n = 2,045) or failed (n = 5,215) cross-validation based on a correlation coefficient threshold of 0.2. e, Number of features per expression-only model that passed (n = 1,966) or failed (n = 5,294) cross-validation based on a correlation coefficient threshold of 0.2. For d and e, the center horizontal line represents the median (50th percentile) value. The box spans from the 25th to the 75th percentile. The whiskers indicate the fifth and 95th percentiles. f, Rank of the target gene (self) as a feature in the cross-validated multi-omics models. g, Rank of the target gene (self) as a feature in the cross-validated expression-only models. h, Comparison of model performance (correlation coefficients) of cross-validated models from multi-omics and expression-only data. Note for b–h that the performance and characteristics of multi-omics and expression-only models are very similar. P values indicated on graphs were determined by the Wilcoxon rank-sum test for two-group comparison (d and e).

The elastic-net models for predicting essentiality of the 7,260 genes (as described above) were compared by tenfold cross-validation (Pearson's r > 0.2; false discovery rate (FDR) < 1 × 10⁻³) when considering expression-only or multi-omics data as input variables (Supplementary Tables 1 and 2). The distribution of features per model skewed higher in the multi-omics models (3–510 features, median of 98) (Fig. 1b) compared to the expression-only models (3–369 features, median of 80) (Fig. 1c) and the performance of both improved with the number of features per model (Fig. 1d,e). Of the 7,260 models, cross-validation confirmed 1,966 expression-only models and 2,045 multi-omics models, most of which overlapped (n = 1,797) (Supplementary Table 3). The incidence of self-inclusion of the target gene in the cross-validated models was also similar between the multi-omics (31% of models) (Fig. 1f) and expression-only (26% of models) (Fig. 1g) datasets. The majority of cross-validated models (76%) performed comparably (within a correlation coefficient of 0.05) using either expression-only or multi-omics data. Likewise, 86 out of 103 annotated oncogenes (84%) with cross-validated models performed similarly using either expression-only or multi-omics datasets (for example, HER2, BRAF and PIK3CA), with a few notable exceptions that included the oncogenes NRAS, FLT3 and ARNT (Fig. 1h and Extended Data Fig. 1a–e). Collectively, these data demonstrate that predictive models of gene essentiality with expression-only (Supplementary Table 1) and multi-omics (Supplementary Table 2) data as input variables perform comparably in detecting selective vulnerabilities of cancer in most cases (Supplementary Table 3).

Constructing TCGADEPMAP
TCGADEPMAP was built using the expression-only elastic-net models of gene essentiality, based on the evidence here (Fig. 1) and elsewhere (refs. 26,27) that the performance of most models was comparable to those including genomic features. Moreover, as genetic information is withheld from the expression-only elastic-net models, the transposed essentiality scores can be correlated with genetic drivers in TCGADEPMAP patients that might otherwise be missed in cancer cell models. Finally, expression-based predictive modeling of essentiality can also be extended to non-oncological studies (for example, GTEX), which do not have somatic mutations and copy number changes (ref. 28).

As outlined in Fig. 2a, the expression-based predictive models of DEPMAP dependencies were transposed using the transcriptomic profiles of 9,596 TCGA patients, following alignment to account for differences between the expression profiles of cell lines and tumor biopsies with varying stromal content. The importance of transcriptional alignment was evident from the strong correlation of the 1,966 cross-validated gene essentiality models with the tumor purity of TCGA samples (Fig. 2b). To overcome this issue, expression data from DEPMAP and TCGA were quantile normalized and transformed by contrastive principal-component analysis (cPCA), a generalization of PCA that detects correlated variance components that differ between two datasets. The removal of the top four contrastive principal components (cPC1–4) between the DEPMAP and TCGA transcriptomes significantly reduced the correlation of tumor dependencies with tumor purity (Fig. 2b) and improved the alignment of the expression-based dependency models (Fig. 2c,d and Extended Data Fig. 1f–h). Enrichment analysis of gene essentiality scores with correlation coefficients that changed the most between the pre- and post-aligned models revealed a significant enrichment of pathways related to the stroma (Supplementary Table 4).
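The first alignment step and the transposition itself can be illustrated with a minimal sketch. It assumes a simple rank-based quantile normalization and a linear model stored as per-gene weights. The function names, toy expression values and coefficients are hypothetical, and the cPCA step is deliberately omitted, so this is not the pipeline used in the study.

```python
from statistics import mean


def quantile_normalize(samples):
    """Force every sample (a list of per-gene values) onto one shared distribution.

    The shared distribution is the mean of the sorted values at each rank.
    Ties are broken by gene order, which is a simplification.
    """
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all samples must cover the same genes")
    # Reference distribution: average of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [mean(s[k] for s in sorted_samples) for k in range(n_genes)]
    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest expression in this sample.
        order = sorted(range(n_genes), key=lambda g: s[g])
        out = [0.0] * n_genes
        for rank, g in enumerate(order):
            out[g] = reference[rank]
        normalized.append(out)
    return normalized


def predict_essentiality(expression, coefficients, intercept=0.0):
    """Apply a fitted linear model (for example, elastic-net weights) to one profile."""
    return intercept + sum(w * expression[g] for g, w in coefficients.items())


# Toy data (made-up numbers): two "cell lines" and one "tumor", three genes each.
cell_lines = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
tumors = [[9.0, 7.0, 8.0]]
aligned = quantile_normalize(cell_lines + tumors)

# Hypothetical model: gene index -> weight, as if learned on the cell lines.
model = {0: -0.5, 2: 0.25}
tumor_score = predict_essentiality(aligned[2], model, intercept=0.1)
```

After normalization every sample shares the same set of values and differs only in which gene holds which rank, so a model trained on cell lines is applied to tumors on a common scale.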
Combined, these data demonstrate that without transcriptional alignment, the predicted gene essentialities in patient samples were strongly correlated with tumor purity, which should not be the case when one considers that these dependency models were generated using cultured cancer cell lines without stroma.

Fig. 2: Building a translational dependency map: TCGADEPMAP. a, Schematic of gene essentiality model transposition from DEPMAP to TCGA, following alignment of genome-wide expression data to account for differences in homogeneous cultured cell lines and heterogeneous tumor biopsies with stroma. b, Coefficient of determination (R²) of the cross-validated gene essentiality models and tumor purity before (n = 1,966) and after transcriptional alignment (n = 1,966). The center horizontal line represents the median (50th percentile) value. The box spans from the 25th to the 75th percentile. The whiskers indicate the fifth and 95th percentiles. A two-sided Wilcoxon rank-sum test was performed to test for statistical significance. c, Uniform Manifold Approximation and Projection (UMAP) visualization showing that normalization of genome-wide transcriptomes improves alignment between cultured cells and patient tumor biopsies with contaminating stroma. d, Correlation coefficients of essentiality profiles of different lineages of cultured cell models and TCGA patient tumors. e, Unsupervised clustering of predicted gene essentiality scores across TCGADEPMAP revealed strong lineage dependencies. Blue indicates genes with stronger essentiality and red indicates genes with less essentiality. f, KRAS dependency was enriched in TCGADEPMAP lineages (n = 9,593) with high frequency of KRAS GOF mutations, including colon adenocarcinoma (COAD), LUAD, STAD, READ, esophageal carcinoma (ESCA) and PAAD. g, KRAS essentiality correlated with KRAS mutations in all TCGADEPMAP lineages (n = 532 for KRASmut and n = 7,049 for KRASwt). h, BRAF dependency in TCGADEPMAP (n = 9,593) was enriched in SKCM, which has a high frequency of GOF mutations in BRAF. i, BRAF essentiality correlated with BRAF mutations in all TCGADEPMAP lineages (n = 559 for BRAFmut and n = 7,022 for BRAFwt). For f–i, the center horizontal line represents the median (50th percentile) value. The box spans from the 25th to the 75th percentile. The whiskers indicate the fifth and 95th percentiles. For g–i, a two-sided Wilcoxon rank-sum test was performed to test for statistical significance. j, Scatter-plot of model selectivity in TCGADEPMAP and DEPMAP, as determined by normality likelihood (NormLRT). k, Ranking of model selectivity between TCGADEPMAP and DEPMAP, as determined by the NormLRT scores. ***P < 0.001, as determined by the Wilcoxon rank-sum test for two-group comparison and Kruskal–Wallis followed by Wilcoxon rank-sum test with multiple test correction for the multi-group comparison. CNS, central nervous system; PNS, peripheral nervous system; ACC, adrenocortical carcinoma; BLCA, bladder urothelial carcinoma; CESC, cervical and endocervical cancers; CHOL, cholangiocarcinoma; GBM, glioblastoma multiforme; HNSC, head and neck squamous cell carcinoma; KIRC, kidney renal clear cell carcinoma; KIRP, kidney renal papillary cell carcinoma; LGG, lower-grade glioma; LIHC, liver hepatocellular carcinoma; MESO, mesothelioma; OV, ovarian serous cystadenocarcinoma; PRAD, prostate adenocarcinoma; SARC, sarcoma; TGCT, testicular germ cell tumors; THCA, thyroid carcinoma; THYM, thymoma; UCEC, uterine corpus endometrial carcinoma; UCS, uterine carcinosarcoma; UVM, uveal melanoma.

To further benchmark the accuracy of TCGADEPMAP, we tested whether gene essentiality in patient tumors could predict tumor lineages and oncogene dependencies, as has been reported in the cell-based dependency maps (ref. 8). Negative predicted values indicate higher predicted essentiality.
Unsupervised clustering of gene essentialities across TCGADEPMAP revealed striking lineage dependencies (Fig. 2e and Supplementary Table 5), including well-known oncogenes such as KRAS (Fig. 2f,g) and BRAF (Fig. 2h,i). For example, KRAS essentiality was markedly stronger in KRAS-mutant stomach adenocarcinoma (STAD), rectal adenocarcinoma (READ), pancreatic adenocarcinoma (PAAD) and colon adenocarcinoma (COAD) lineages (Fig. 2f,g), whereas BRAF essentiality was strongest in BRAF-mutant skin cutaneous melanoma (SKCM) (Fig. 2h,i). We more broadly compared oncogene essentiality in TCGA patients with or without a gain-of-function (GOF) event (mutation or amplification), using the list of 100 cross-validated models for oncogenes from the COSMIC Cancer Gene Census (https://cancer.sanger.ac.uk/census). Of the 100 oncogenes, a total of 85 gene essentialities predicted stronger dependencies in patients with a GOF event (Supplementary Table 6). To ensure that the associations between dependencies and mutations were not due to the same underlying predictive features, the accuracy of elastic-net models in predicting essentiality and somatic mutations in the same genes was compared. The comparison was restricted to genes with cross-validated models of essentiality and somatic mutations with >2% prevalence (n = 891 models). The elastic-net models were allowed to select the most informative predictive features for mutation and essentiality for each gene, as the best predictors for essentiality may not be the best features to predict mutation. Comparison of the area under the curve (AUC) of the two model sets revealed that transcriptomic features were significantly more predictive of gene essentiality than of mutational status (Extended Data Fig. 1i).
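For readers unfamiliar with the AUC comparison above, the following minimal sketch computes the AUC as a rank statistic for two hypothetical models of the same gene. It assumes that essentiality is binarized into dependent and nondependent samples, which this section does not state explicitly. The `roc_auc` helper and all values are made up for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up held-out predictions for one gene from two hypothetical models.
essentiality_pred = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
is_dependent = [1, 1, 1, 0, 0, 0]      # assumed binarization of essentiality
mutation_pred = [0.7, 0.3, 0.6, 0.65, 0.2, 0.4]
is_mutant = [1, 1, 0, 0, 1, 0]

auc_essentiality = roc_auc(essentiality_pred, is_dependent)
auc_mutation = roc_auc(mutation_pred, is_mutant)
```

In this toy case the expression-based scores separate dependent from nondependent samples much better than they separate mutant from wild-type samples, which is the kind of gap the comparison in the text tests across 891 models.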
Considering that the expression-only models of essentiality did not include genomic features, these data further demonstrate that the essentiality scores in TCGADEPMAP can be independently correlated with genomic features in patient tumors. Combined with the evidence that cross-validated gene essentiality models accurately predict cancer lineages, these data suggest that the cross-validated gene essentiality models are accurate and interpretable across a wide range of biological contexts, including oncogenic dependencies.

Selective dependencies in TCGADEPMAP
Strongly selective dependencies (SSDs) have been characterized in cell-based maps using the normality likelihood ratio test (NormLRT) to rank whether an essentiality fits a normal or t-skewed distribution (selective) across the cohort (refs. 20,29). A strength of this approach is the ability to rank SSDs regardless of the underlying mechanisms of dependency (for example, lineage, genetic and expression). To compare the SSDs in patients with cancer and cell models, NormLRT was applied to gene effect scores for the cross-validated essentiality models in TCGADEPMAP and DEPMAP, respectively. Most SSDs (NormLRT > 100) correlated well between TCGADEPMAP and DEPMAP (r = 0.56, P < 0.0001), including KRAS, BRAF, MYCN and many other known SSDs (Fig. 2j and Supplementary Table 7). Nevertheless, there were several examples where the SSDs differed between patients and cell models (Fig. 2j,k). Notably, druggable oncogenes (for example, FLT3 and PTPN11) were more prominent SSDs in TCGADEPMAP patients than in DEPMAP cell lines, whereas other notable SSDs in the DEPMAP (for example, ATP6V0E1) were less noticeable in TCGADEPMAP (Fig. 2j,k). The top predictive features for essentiality of FLT3 (self-expression) and ATP6V0E1 (paralog expression) did not differ between DEPMAP and TCGADEPMAP, yet the distribution and prevalence of strong dependency scores varied across lineages between patients and cell lines (Extended Data Fig. 2a–d). Likewise, the dependency on PTPN11 (SHP2) was noticeably more selective in TCGADEPMAP than in DEPMAP (Fig. 2j,k), which was reflected by greater essentiality in a subset of patients with breast cancer (BRCA) (Extended Data Fig. 2e) that was absent from BRCA cell lines (Extended Data Fig. 2f). Fisher's exact tests showed that the genetic drivers enriched in TCGADEPMAP patients with BRCA who were most dependent on PTPN11 included TP53 mutations and HER2/ERBB2 amplifications (Extended Data Fig. 2g), whereas FAT3 deletions and GATA3 mutations were depleted in these patients (Extended Data Fig. 2h). Particularly in the case of HER2, which signals through SHP2 and the RAS pathway, these data fit with the observation that RAS pathway inhibitors, including SHP2 inhibitors, are more potent in the three-dimensional (3D) than in the two-dimensional (2D) context (refs. 30,31). Thus, the presence of TCGADEPMAP patients with BRCA who were highly dependent on PTPN11 is likely due to the 3D context of patient tumors, whereas DEPMAP BRCA cell lines with similar genetic drivers are not PTPN11 dependent due to the 2D context of cultured cells. Collectively, these data demonstrate that identifying SSDs can be impacted by different prevalence and distributions of the underlying drivers in patients and cell models, which can be overcome by patient-relevant dependency maps, such as TCGADEPMAP.

Clinical phenotypes and outcomes in TCGADEPMAP
Another strength of translational tumor dependency maps is the ability to assess the impact of gene essentiality on clinically relevant phenotypes, such as molecular subtyping, therapeutic response and patient outcomes. To evaluate the utility of TCGADEPMAP for therapy-relevant patient stratification, an unsupervised clustering of the 100 most variable gene dependencies was performed using the TCGADEPMAP BRCA cohort (Fig. 3a). The 100-dependency signature (DEP100) performed comparably to the established PAM50 signature (ref. 32) in classifying BRCA subtypes (AUC > 0.8 for most subtypes), despite only three overlapping genes between PAM50 and DEP100 (Fig. 3b). Dependency subtyping with DEP100 predicted significantly higher ESR1 essentiality in ER-positive tumors (Fig. 3c) and higher HER2 essentiality in HER2-amplified tumors (Fig. 3d). Finally, due to the limited accessibility of therapeutic response data in TCGA (ref. 33), we identified nine clinical datasets for molecular therapeutics of tumor dependencies for which we had accurate models and sufficient statistical power (refs. 34,35,36). In seven of these nine datasets, the dependency models significantly predicted clinical responses and performed better than or comparably to target gene expression in predicting therapeutic responses (Fig. 3e–h and Supplementary Table 8). Both nonsignificant datasets trended in the correct direction and would likely reach statistical significance with larger cohort sizes. Taken together, these data establish the physiological relevance of TCGADEPMAP to associate dependencies with common clinicopathological features, such as molecular subtyping and therapeutic response.

Fig. 3: Translating TCGADEPMAP to clinically relevant phenotypes and outcomes. a, Unsupervised clustering of the top 100 dependencies in TCGA breast cancer patients. b, A ROC–AUC analysis was used to test the accuracy of calling breast cancer subtypes using the top 100 dependencies. c, ESR1 dependencies are strongest in ER-positive luminal BRCA (n = 96 for basal-like, n = 57 for HER2+, n = 231 for luminal A, n = 126 for luminal B and n = 7 for normal-like). d, HER2 dependencies are strongest in HER2-amplified BRCA (n = 96 for basal-like, n = 57 for HER2+, n = 231 for luminal A, n = 126 for luminal B and n = 7 for normal-like). e, HER2 dependency predicts trastuzumab response in patients with BRCA (n = 6 for no response, n = 33 for partial response and n = 9 for complete response). f, BRAF dependency predicts sorafenib response in patients with hepatocellular cancer (n = 46 for non-responder and n = 21 for responder). g, EGFR dependency predicts cetuximab response in patients with head and neck cancer (n = 26 for non-responder and n = 14 for responder). For c–g, *P < 0.05, **P < 0.01 and ***P < 0.001, as determined by the Wilcoxon rank-sum test for two-group comparison and Kruskal–Wallis test followed by a Wilcoxon rank-sum test with multiple test correction for the multi-group comparison. For boxplots in c–g, the center horizontal line represents the median (50th percentile) value. The box spans from the 25th to the 75th percentile. The whiskers indicate the 5th and 95th percentiles. h, AUC values for drug response predictions based on essentiality, expression and random essentiality scores generated via random sampling (control). i, Top gene essentialities associated with the PFI by univariate Cox proportional hazard regression model across multiple lineages in TCGADEPMAP (Benjamini–Hochberg, FDR < 0.2). j, HRs of the top essentialities across TCGADEPMAP. Blue indicates a greater dependency associated with worse outcome and red indicates a greater dependency associated with better outcome. P values and HRs are shown in Supplementary Table 9.

The ability to associate gene essentiality with patient survival is a unique strength of TCGADEPMAP, which is not accessible using cell-based dependency maps.
Moreover, outcomes driven by perturbations of oncogenic pathways and genetic drivers of human cancers are likely not captured by gene expression alone and rather require a readout of gene essentiality. To test this possibility, the cross-validated gene essentiality models (n = 1,966) were tested for association with the progression-free interval (PFI) in TCGADEPMAP. Among 29 cancer lineages that are well powered for PFI analysis (ref. 33), 105 known genetic drivers of human cancer were significantly associated with the PFI of TCGA patients (Supplementary Table 9), including 29 that were prognostic in at least four cancer lineages (Fig. 3i,j). For example, a stronger dependency on the druggable oncogene STAT3 (ref. 35) was significantly associated with a shortened time to disease progression in six different cancers (Fig. 3i,j). Likewise, multiple other prevalent genetic drivers of human malignancies were associated with a significantly shorter PFI, including PAX5 and PDGFRA (Fig. 3i,j). Both proteins have been investigated previously as prognostic indicators of poor outcomes by expression analysis in patient biopsies (refs. 37,38) and this study shows, using a translational dependency map, that dependency on these oncogenes is associated with worse outcome in patients.

Synthetic lethalities in TCGADEPMAP
In addition to illuminating lineage and oncogenic dependencies, the DEPMAP has dramatically expanded the list of potential synthetic lethalities (the loss of a gene sensitizes tumor cells to inhibition of a functionally redundant gene within the same pathway) (refs. 6,16,17,39,40); however, one of the current limitations of the DEPMAP is that the available cancer cell models do not yet fully recapitulate the genetic and molecular diversity of TCGA patients (ref. 25). Thus, we assessed the landscape of predicted synthetic lethalities with loss-of-function (LOF) events (damaging mutations or deletions) in TCGADEPMAP. Lasso regression analysis of gene essentiality profiles and 25,026 LOF events detected in TCGADEPMAP yielded 633,232 synthetic lethal candidates (FDR < 0.01) (all candidates added as an R object to a figshare repository), which were too numerous to experimentally validate by current methods. To prioritize the synthetic lethal candidates, the gene interaction scores were correlated with the mutual exclusivity of corresponding mutations in TCGADEPMAP, which narrowed the list to 28,609 candidates (FDR < 0.01). Multiple additional criteria were then applied to refine the list further by enriching for predicted paralogs with close phylogenetic distance to prioritize candidates with redundant functions due to sequence homology. All told, this approach identified many known synthetic lethal pairs (for example, STAG1/STAG2, SMARCA2/SMARCA4 and EP300/CREBBP) (refs. 41,42,43) and previously untested synthetic lethal candidates, demonstrating that TCGADEPMAP is well powered to predict synthetic lethal relationships with LOF events in patient tumor biopsies (Extended Data Fig. 3a–d and Supplementary Table 10).

Synthetic lethalities that were predicted with LOF events in the TCGADEPMAP (n = 604 pairs) were experimentally tested using a multiplexed CRISPR/AsCas12a screening approach across representative cell models of five cancer lineages (Fig. 4a,b).
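The mutual-exclusivity prioritization described above can be illustrated with a one-sided Fisher's exact test on a 2 × 2 table of LOF events in two genes. This is a minimal sketch under the assumption that mutual exclusivity is scored from co-occurrence counts. The section does not specify the statistic actually used, and the function name and counts below are hypothetical.

```python
from math import comb


def mutual_exclusivity_p(n_both, n_a_only, n_b_only, n_neither):
    """One-sided Fisher's exact P value for co-occurrence <= the observed count.

    Small values suggest that events in genes A and B co-occur less often
    than expected if they were independent.
    """
    n_a = n_both + n_a_only            # tumors with an event in gene A
    n_b = n_both + n_b_only            # tumors with an event in gene B
    total = n_a + n_b_only + n_neither
    # Hypergeometric lower tail: sum P(X = k) from the smallest feasible
    # overlap up to the observed overlap.
    k_min = max(0, n_a + n_b - total)
    tail = sum(comb(n_a, k) * comb(total - n_a, n_b - k)
               for k in range(k_min, n_both + 1))
    return tail / comb(total, n_b)


# Made-up counts for two hypothetical gene pairs across 100 tumors.
p_exclusive = mutual_exclusivity_p(n_both=0, n_a_only=20, n_b_only=20, n_neither=60)
p_random = mutual_exclusivity_p(n_both=4, n_a_only=16, n_b_only=16, n_neither=64)
```

A pair whose LOF events never co-occur despite each being common yields a small P value, whereas a pair co-occurring at roughly the expected rate does not, so ranking candidates this way favors pairs that tumors appear unable to lose together.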
Additional pairs (n = 261 controls) were added to the library to control for screen performance, including essential paralog pairs and nonessential pairs of tumor suppressor genes (TSGs) and interacting partners (Supplementary Table 10). An initial pilot screen was performed using five cancer cell models, which experimentally validated 69 TCGADEPMAP synthetic lethalities in at least one representative cell model (Supplementary Table 11). As these data were being generated, an enhanced AsCas12a (enAsCas12a) enzyme was reported to be compatible with CRISPR/AsCas12a libraries (ref. 44), enabling replication of the initial pilot screens and expansion to a total of 16 cancer cell models. Notably, the replication of the initial screens was highly concordant across the five cell models in common (average r = 0.69) (Extended Data Fig. 3e–i) and showed increased depletion of essential controls and synthetic lethal partners compared to nonessential controls (Fig. 4c). In addition to novel pairs, multiple previously reported synthetic lethalities (HSP90AA1/HSP90AB1 (ref. 45), DDX19A/DDX19B (ref. 45), HDAC1/HDAC2 (refs. 45,46), SMARCA2/SMARCA4 (refs. 45,46), EP300/CREBBP (ref. 43), STAG1/STAG2 (refs. 42,46) and CNOT7/8 (ref. 47)) were replicated across multiple cell lines in both cohorts (Supplementary Table 11), demonstrating the robustness of the multiplex CRISPR/Cas12a screening platform to test synthetic lethalities. Notably, as observed elsewhere (refs. 39,41,46), the sensitivity to synthetic lethalities varied between cell models and lineages, implicating the prevalence of unknown modifiers of synthetic lethality that manifest in different cellular contexts and are yet to be fully understood.

Fig. 4: Using TCGADEPMAP to translate synthetic lethalities in human cancer. a, Schematic of the CRISPR/Cas12 library multiplexed guide arrays targeting one or two genes per array. b, Schematic of the synthetic lethality screening approach using the CRISPR/Cas12 library. All CRISPR screens were performed as n = 3 biological replicates per cell line. c, Violin plots of the target-level average log₂ fold change (FC) in the CRISPR screens across all tested cell lines for nontargeting (NT) guide (neg CTRL), single knockout guides targeting essential genes (single KO CTRL), DKO guides targeting essential genes (DKO CTRL), single knockout guides of TCGADEPMAP candidates (single KO) and DKO guides of TCGADEPMAP candidates (DKO). d, Rank plot of target-level gene interaction (GI) scores averaged across n = 14 cell lines in the CRISPR/Cas12 multiplexed screening (A549, DETROIT562, FADU, H1299, H1703, HCT116, HSC2, HSC3, HT29, MDAMB231, MIAPACA2, PANC1, PC3M and SNU1), including the top five synthetic lethalities (table insert). The black line indicates the mean and gray error bars show ±s.e.m. e, Distribution of synthetic lethal candidates from TCGADEPMAP with experimental evidence of synthetic lethality in the CRISPR/Cas12 multiplexed screening across 14 cancer cell lines. A blue box indicates a GI score < −2. f,g, Cell viability assessed by CellTiterGlo (CTG) luminescence at 7 days after single (KO) or dual (DKO) CNOT7/CNOT8 knockouts, normalized to NT controls in five cell lines grown in 2D monolayers (f) or 3D spheroids (g); n = 3 biological replicates per cell model per condition with the exception of n = 5 biological replicates for Hs578T grown in 2D monolayer. Error bars are mean ± s.d. h, Crystal violet staining of CNOT7−/− clones C1 and C2 stably expressing nontargeting (sgNT) or CNOT8-targeting (sgCNOT8) dox-inducible guide constructs, following 7 days of dox treatment (Methods). i, Tumor xenograft studies of HT29 clones grown in mice fed dox-containing food from day 0 (gray and green lines) or beginning on day 19 (blue lines). n = 5 mice per group. Error bars are ±s.d. Asterisks in f, g and i reflect two-tailed, unpaired Student's t-test P values; *P < 0.05; **P < 0.01; ***P < 0.001.

Of the 604 synthetic lethalities predicted by TCGADEPMAP, a total of 78 (13%) were experimentally validated in at least one representative cell model (Fig. 4d,e and Supplementary Table 11). For example, double knockout (DKO) of CNOT7/8 was synthetic lethal in 11 out of 14 cell lines that were screened (Fig. 4e) and was orthogonally validated in five cell models by DKO using ribonucleoprotein (RNP) in both 2D monolayer and 3D spheroid assays (Fig. 4f,g). Likewise, doxycycline (dox)-inducible loss of CNOT8 was synthetic lethal in HT29 cells that lacked CNOT7 in both in vitro 2D monolayers (Fig. 4h) and in vivo mouse xenograft studies (Fig. 4i). Notably, loss of CNOT7 in single knockout (KO) cells coincided with elevated CNOT8 protein (Extended Data Fig. 3j), fitting with previous observations that loss of CNOT7 increases integration of CNOT8 into the CCR4–NOT complex (ref. 48). Likewise, CNOT8 protein levels were inversely correlated with CNOT7 copy numbers in patients with lung adenocarcinoma (LUAD) and BRCA in the NCI Clinical Proteomic Tumor Analysis Consortium cohort (Extended Data Fig. 3k). Collectively, these observations demonstrate the power of TCGADEPMAP to detect patient-relevant synthetic lethal mechanisms, which can be orthogonally validated and provide therapeutic targets for drug discovery.

Another discovery using TCGADEPMAP was the prediction of PAPSS1 synthetic lethality with deletion of PAPSS2 and the neighboring tumor suppressor, PTEN, which were frequently co-deleted in TCGA patient tumors (43% co-incidence) yet were largely unaffected in cancer cell lines (Extended Data Fig. 4a–g).
PAPSS1/PAPSS2 are functionally redundant enzymes essential for synthesis of 3′-phosphoadenosine 5′-phosphosulfate (PAPS), which is required for all sulfonation reactions (ref. 49), suggesting that loss of PAPSS1/PAPSS2 is synthetic lethal due to the inability to sulfonate proteins. To test this hypothesis, PAPSS1/PAPSS2 were targeted in H1299 spheroids by RNP, followed by measurement of spheroid growth and sulfonation levels of heparan sulfate (HS) proteoglycan (HSPG) chains on the cell surface by flow cytometry. Confirming the CRISPR/Cas12 screen data (Fig. 5a), dual loss of PAPSS1 and PAPSS2 significantly reduced H1299 spheroid growth compared to controls (Fig. 5b and Extended Data Fig. 4h,i), which coincided with loss of HSPG sulfonation (Fig. 5c). Likewise, targeting PAPSS1 by RNP in UMUC3 cells, which endogenously lack PAPSS2 and PTEN, also significantly depleted HSPG sulfonation and coincided with significant spheroid growth reduction, which could be rescued by addition of exogenous heparan sulfate (Fig. 5d and Extended Data Fig. 4h,j). Finally, PAPSS1/PAPSS2 synthetic lethality was confirmed in vivo, as demonstrated by a significant tumor growth reduction of UMUC3 tumors without PAPSS1 and PAPSS2 compared to control tumors lacking only PAPSS2 (Fig. 5e and Extended Data Fig. 4k). Taken together, these data demonstrate that translational dependency maps, such as TCGADEPMAP, are powerful tools to uncover synthetic interactions that were previously underrepresented in cancer models and are likely to be patient relevant.

Fig. 5: PAPSS1 and PAPSS2 are novel synthetic lethal paralogs detected by TCGADEPMAP. a, Rank plot of target-level GI scores in H1299 cells, including the top ten synthetic lethalities (table insert). The novel synthetic lethality, PAPSS1/PAPSS2, is highlighted in blue. All CRISPR screens were performed as n = 3 biological replicates per cell line. b, Spheroid size of H1299 cells with single or dual PAPSS1 and PAPSS2 knockouts, normalized to NT control spheroids; n = 4 biological replicates per condition. Data show mean ± s.d. *P < 0.05 and **P < 0.01 as per unpaired, two-tailed t-test. c, Flow cytometry histogram overlay plots of viable H1299 and UMUC3 cells (DAPI−) showing expression of cell surface sulfonated HSPGs as measured by antibody clone 10E4-FITC. Dual loss of PAPSS1/PAPSS2 leads to total loss of sulfonation comparable to heparinase III treatment (HepIII*), which specifically cleaves sulfonated HS chains. d, Growth defects of UMUC3 spheroids following deletion of PAPSS1 (yellow bars) were partially rescued by the addition of 10 μg ml⁻¹ and 50 μg ml⁻¹ of exogenous HS as compared to NT control spheroids (green bars); n = 4 biological replicates for the untreated control and n = 3 biological replicates per treated condition. Data are mean ± s.d. *P < 0.05 as per unpaired, two-tailed t-test. e, Diagram showing tumor volumes over time (d, days) after in vivo implantation of 1 × 10⁶ UMUC3 NT or PAPSS1-KO cells in SCID/beige mice. Each dot represents an individual mouse (n = 5 mice per condition); ***P < 0.001, as determined by unpaired, two-tailed t-test of the final data point. f, Kaplan–Meier plot showing that TCGADEPMAP patients with a predicted PAPSS1/PAPSS2 synthetic lethality have a worse outcome compared to the rest of the cohort, as determined by a Cox log-rank test. DAPI, 4′,6-diamidino-2-phenylindole.

TCGADEPMAP is unique in its ability to uncover potential synthetic lethalities that can be related to patient outcomes, enabling the prioritization of the experimentally validated synthetic lethalities that correlate with the worst outcome and are therefore likely to have the greatest clinical impact if druggable.
To test this possibility, a Cox log-rank test was used to assess overall survival (OS) of TCGA patients in relation to gene essentiality predicted by TCGADEPMAP and LOF events (mutation, deletion or both) in the putative synthetic lethal partner. After controlling for tumor lineage, PAPSS1 dependency in TCGADEPMAP was correlated with significantly worse OS (hazard ratio (HR) = 0.61, P = 0.0004) in patients with PAPSS2 deletion (Fig. 5f), demonstrating that PAPSS1 is a synthetic lethal target with potentially high translational impact. Collectively, these data demonstrate that translational dependency maps can enable the discovery, validation and translation of synthetic lethalities.

Constructing PDXEDEPMAP
In addition to building TCGADEPMAP, a similar approach was applied to generate an orthogonal translational dependency map using the PDX Encyclopedia (PDXEDEPMAP)50. As outlined in Fig. 6a, PDXEDEPMAP was assembled by transferring the 1,966 cross-validated expression-only models from the DEPMAP to the PDXE (n = 191 tumors) using the aligned genome-wide expression profiles from the PDXE (Supplementary Table 12). Unsupervised clustering of gene essentialities across five well-represented lineages in PDXEDEPMAP confirmed that lineage is a key driver of gene dependencies (Fig. 6b), fitting with the observations made in TCGADEPMAP (Fig. 2e). PDXEDEPMAP also detected markedly stronger KRAS essentiality in KRAS-mutant PDX of pancreatic ductal carcinoma (PDAC) and colorectal carcinoma (CRC) lineages (Fig. 6c,d), whereas BRAF essentiality was strongest in BRAF-mutant PDX of cutaneous melanoma (CM) (Fig. 6e,f). These data collectively demonstrate that PDXEDEPMAP performed comparably to TCGADEPMAP and is well powered to detect gene essentiality signals in PDX models.

Fig. 6: Building a translational dependency map in patient-derived xenografts: PDXEDEPMAP.
a, Schematic of gene essentiality model transposition from DEPMAP to PDXE, following alignment of genome-wide expression data to account for differences between homogeneous cultured cell lines and PDX samples with contaminating stroma. b, Unsupervised clustering of predicted gene essentiality scores across five lineages in PDXEDEPMAP confirmed similar lineage drivers of gene dependencies, as observed in TCGADEPMAP. Blue indicates genes with stronger essentiality and red indicates genes with less essentiality. c, KRAS dependency was enriched in PDXEDEPMAP lineages with a high frequency of KRAS GOF mutations, including CRC and PDAC. n = 43 for BRCA, n = 51 for CRC, n = 27 for NSCLC, n = 39 for PDAC and n = 32 for CM.
d, KRAS essentiality correlated with KRAS mutations in all PDXEDEPMAP lineages (n = 74 for KRASmut and n = 117 for KRASwt). e, BRAF dependency in PDXEDEPMAP was enriched in CM, which has a high frequency of GOF mutations in BRAF. n = 43 for BRCA, n = 51 for CRC, n = 27 for NSCLC, n = 39 for PDAC and n = 32 for CM. f, BRAF essentiality correlated with BRAF mutations in all PDXEDEPMAP lineages (n = 32 for BRAFmut and n = 159 for BRAFwt). For c–f, the center horizontal line represents the median (50th percentile) value. The box spans from the 25th to the 75th percentile. The whiskers indicate the fifth and 95th percentiles. g, Top gene essentiality models correlated with PDX response to erlotinib in PDXEDEPMAP. h, Top gene essentiality models correlated with PDX response to cetuximab in PDXEDEPMAP. ***P < 0.001, as determined by the Wilcoxon rank-sum test for two-group comparison (d and f) and Kruskal–Wallis test followed by a Wilcoxon rank-sum test with multiple test correction for a multi-group comparison (c and e). NSCLC, non-small cell lung cancer.

In addition to orthogonal validation of TCGADEPMAP, a unique strength of PDXEDEPMAP is the ability to assess gene essentiality in the context of therapeutic responses across five cancer lineages and 15 molecular therapies50. To test the ability of gene essentiality to predict the response to corresponding targeted therapies, the change in PDX burden from baseline to experimental end point was correlated with target gene essentiality. This revealed that responses to 80% of drugs (12 of 15) were significantly correlated (P < 0.05) with the predicted essentiality of the target gene (Supplementary Table 13). For example, trastuzumab response in the PDXEDEPMAP was strongly predicted by HER2 dependency (R = 0.4849, P = 0.002, AUC = 0.75), in line with the predictive power of HER2 dependency on trastuzumab responsiveness in patients with HER2-amplified BRCA (Fig. 3e).
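As a rough illustration of this kind of comparison, the Python sketch below correlates predicted target essentiality with the change in PDX burden and computes an AUC for separating responders from non-responders. All values, the responder cutoff of -30% and the function names are hypothetical and are not taken from the PDXE data or the authors' code. A Pearson coefficient is assumed (this section does not state which coefficient R refers to), as is the convention that more negative scores denote stronger dependency.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: one entry per PDX model treated with a single drug.
# Assumed convention: more negative essentiality = stronger predicted dependency.
essentiality = [-1.2, -0.9, -0.7, -0.4, -0.2, 0.0]
# Hypothetical % change in tumor burden from baseline (negative = shrinkage).
burden_change = [-60.0, -45.0, -10.0, 20.0, -35.0, 80.0]

# Hypothetical responder definition: at least 30% shrinkage.
responders = [change <= -30.0 for change in burden_change]
# Flip the sign so that a higher score means a stronger predicted dependency.
dependency = [-score for score in essentiality]

print("R =", round(pearson_r(essentiality, burden_change), 3))
print("AUC =", round(roc_auc(dependency, responders), 3))
```

Under these assumptions, a positive R means that models with stronger predicted dependency on the drug target showed greater tumor shrinkage, and an AUC above 0.5 means that the dependency score ranks responders above non-responders more often than chance.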
Other examples, such as erlotinib (R = 0.4937, P = 0.01, AUC = 0.78) and cetuximab (R = 0.2293, P = 0.06, AUC = 0.83), which target the same gene (EGFR), provide the opportunity to explore dependency mechanisms of therapeutic resistance across modalities. Comparisons of PDX responses to erlotinib or cetuximab revealed dependencies within two common pathways: the SWI/SNF complex (SMARCA2 and SMARCD1) and protein trafficking (EMC4, EMC6, VPS39 and MAPK14) (Fig. 6g,h). Notably, components of both pathways have been implicated in resistance to EGFR inhibitors51,52, suggesting that targeting these dependencies would likely improve patient outcomes. Taken together, these data demonstrate the ability of gene essentiality to predict therapeutic response and highlight the translatability of PDX modeling to patient-relevant clinical outcomes.

Translating gene tolerability in GTEXDEPMAP
A final objective of this study was to define gene essentiality in the context of healthy tissues, which would provide a resource for prioritizing tumor dependencies with the best predicted tolerability. To achieve this objective, the expression-based dependency models from DEPMAP were transposed using the aligned expression data from GTEX (GTEXDEPMAP), a compendium of deeply phenotyped normal tissues collected from postmortem healthy donors (n = 948)28 (Fig. 7a and Supplementary Table 14). To assess the sensitivity of GTEXDEPMAP to dependencies with low tolerability, the molecular targets of drugs with reported toxicities in the liver and blood (n = 241) were compared across GTEXDEPMAP (Supplementary Table 15). This revealed that the average essentiality of these targets was higher in liver and blood than in other normal tissues (Fig. 7b). Likewise, unsupervised clustering of the 1,966 cross-validated gene essentiality models revealed strong tissue-of-origin dependencies in healthy organs (Fig. 7c), suggesting that tissue-specific biological context also contributes to gene essentiality in normal physiological settings. Taken together, these data demonstrate that GTEXDEPMAP is sensitive to known toxicities, which cluster around different healthy organ types.

Fig. 7: Building a translational dependency map in normal tissues: GTEXDEPMAP.
a, Schematic of gene essentiality model transposition from DEPMAP to GTEX, following alignment of genome-wide expression data to account for differences between homogeneous cultured cell lines and healthy tissue biopsies. b, Average gene essentiality profile across healthy tissues of GTEXDEPMAP (n = 17,382) for molecular targets with known liver and blood toxicities (in blue). c, Unsupervised clustering of predicted gene essentiality scores across healthy tissues. Blue indicates genes with stronger essentiality and red indicates genes with less essentiality.
d, KRAS essentiality is significantly higher in PAAD with GOF mutations compared to healthy pancreas in GTEXDEPMAP (n = 146 for cancer with n = 106 KRASmut and n = 40 KRASwt, n = 328 for normal). e, BRAF essentiality is significantly higher in SKCM with GOF mutations compared to normal skin in GTEXDEPMAP (n = 319 for cancer with n = 165 BRAFmut and n = 154 BRAFwt, n = 1,809 for normal). For b, d and e, the center horizontal line represents the median (50th percentile) value. The box spans from the 25th to the 75th percentile. The whiskers indicate the fifth and 95th percentiles. f, Global differences between the predicted target efficacy score (TCGADEPMAP) and the healthy tissue-of-origin tolerability score (GTEXDEPMAP). g, STRING network analysis of the top 100 LUAD targets with the greatest predicted tolerability in healthy lung reveals significant connectivity (P < 1 × 10⁻¹⁶) and gene ontology enrichment for oxidative phosphorylation (blue-colored spheres; P = 5.8 × 10⁻¹¹) and mitochondrial translation (red-colored spheres; P = 2.9 × 10⁻²⁰). ***P < 0.001, as determined by a Wilcoxon rank-sum test for two-group comparison and Kruskal–Wallis test followed by a Wilcoxon rank-sum test with multiple test correction for a multi-group comparison (d and e).

Comparing essentiality scores of known druggable oncogenes in TCGADEPMAP with GTEXDEPMAP revealed greater dependency in malignant tissues versus the healthy tissue of origin. For example, KRAS and BRAF essentialities seem to be concomitantly dependent on lineage and genetic drivers, as the healthy tissues of origin were predicted to be significantly less affected in GTEXDEPMAP compared to TCGADEPMAP (Fig. 7d,e). Likewise, similar observations were made for other oncogenic drivers that are approved therapeutic targets in patients with cancer, such as HER2-amplified BRCA (Extended Data Fig. 5a).
In contrast, there was markedly less separation in the predicted essentialities of malignant tumors and healthy tissues of origin for molecular therapies that have yet to be successful in clinical trials (Supplementary Table 16). To refine the list of oncogenic pathways with significant differences in tumor efficacy and healthy tissue-of-origin tolerability, we compared dependency (TCGADEPMAP) and tolerability (GTEXDEPMAP) scores across all genes and tissues (Fig. 7f). Pathway analysis of the strongest tumor dependencies with the least tissue-of-origin toxicity revealed enrichment of multiple oncogenic pathways and pathophysiological processes (Supplementary Table 17), including dysregulation of oxidative phosphorylation (P = 5.8 × 10⁻¹¹) and mitochondrial translation (P = 2.9 × 10⁻²⁰) pathways that were enriched in LUAD compared to healthy lung (Fig. 7g and Extended Data Fig. 5b). Combined, these observations suggest that predicted gene essentiality in the context of a driver mutation and correspondingly low essentiality within the healthy tissue of origin is likely to identify efficacious drug targets with acceptable tolerability.

Tool for visualizing translational dependencies
To enable visualization of the data, we have provided an interactive web-based application (https://xushiabbvie.shinyapps.io/TDtool/) for exploring the data within TCGADEPMAP, PDXEDEPMAP and GTEXDEPMAP.
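The comparison of tumor dependency with healthy-tissue tolerability described above can be sketched in a few lines of Python. The sketch assumes that each expression-only model reduces to an intercept plus per-gene coefficients (as for a fitted linear elastic-net model) and that more negative predicted scores denote stronger essentiality. The gene names, coefficients, expression values and the definition of the gap are hypothetical illustrations, not the authors' implementation or data.

```python
from statistics import mean


def predict_essentiality(model, expression):
    """Linear prediction: intercept plus the sum of coefficient * aligned expression."""
    return model["intercept"] + sum(
        coef * expression[gene] for gene, coef in model["coefficients"].items()
    )


def efficacy_tolerability_gap(model, tumor_profiles, normal_profiles):
    """Mean predicted score in normal tissue minus mean predicted score in tumors.

    With more negative scores meaning stronger essentiality (an assumption),
    a positive gap means the tumor is predicted to be more dependent on the
    gene than its healthy tissue of origin.
    """
    tumor_mean = mean(predict_essentiality(model, p) for p in tumor_profiles)
    normal_mean = mean(predict_essentiality(model, p) for p in normal_profiles)
    return normal_mean - tumor_mean


# Hypothetical expression-only model for a single target gene.
model = {"intercept": -0.1, "coefficients": {"GENE_A": -0.2, "GENE_B": 0.05}}

# Hypothetical aligned expression profiles (for example, log-scale values).
tumor_profiles = [
    {"GENE_A": 6.0, "GENE_B": 2.0},
    {"GENE_A": 5.0, "GENE_B": 4.0},
]
normal_profiles = [
    {"GENE_A": 1.0, "GENE_B": 2.0},
    {"GENE_A": 2.0, "GENE_B": 4.0},
]

gap = efficacy_tolerability_gap(model, tumor_profiles, normal_profiles)
print("efficacy-tolerability gap:", round(gap, 3))
```

Ranking genes by such a gap within each lineage is one simple way to shortlist candidates for pathway analysis. The sketch omits the expression alignment step, which the text indicates is performed before the models are transposed.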
