The prominent pervasive oncogenic role and tissue-specific permissiveness of RAS gene mutations

Data mining of CRISPR effect scores in DepMap showed prevalent difference in RAS genes between mutant lines vs. WT lines but also revealed drawbacks of individual assessment of genes of interest

RAS mutations are commonly considered oncogenic drivers in lung (LUAD), colon (COAD), and pancreas (PAAD) tumors, and we were intrigued by the possibility that they may also be critical oncogenic drivers in additional tumor types. To evaluate the potential oncogenic role of mutated KRAS in a much broader spectrum of tissue types, we evaluated CRISPR effect scores from the DepMap database. These scores, part of DepMap’s CRISPR screening data, offer insights into the level of dependency of each cell line on the corresponding gene.

RAS genes rarely exhibit the most negative CRISPR effect scores (data not shown). This finding is consistent with the fact that the CRISPR effect score primarily reflects a gene’s ability to impact cell survival and proliferation, rather than its presumed oncogenic role. In the context of oncogenesis, however, it is the genomic changes, particularly mutations, occurring in oncogenes that distinguish them from other essential genes without such functions.

If RAS mutations confer a survival and growth advantage in cell lines, indicative of oncogene addiction, we hypothesized that RAS mutant cell lines would exhibit a greater dependency on these RAS genes (more negative CRISPR effect scores) compared to RAS wild-type (WT) lines. Consequently, our primary focus was to assess whether the KRAS gene consistently displayed more negative CRISPR effect scores in KRAS mutant lines than in WT lines across various tissue types, rather than emphasizing the absolute negativity of the CRISPR effect scores for KRAS in these cell lines.

We initially focused on analyzing the CRISPR effect scores of the KRAS gene.
Encouragingly, we observed a clear negative shift in the CRISPR effect score distribution of KRAS mutants compared to WT lines. Furthermore, we identified a significant difference in scores between KRAS mutant and WT lines across all tissue types that had sufficient sample sizes (Supplementary Fig. 1). According to the DepMap database, a more negative CRISPR effect score for a gene indicates a higher dependency of the cell line on that gene [10,11,12]. Remarkably, our findings revealed that KRAS mutant lines displayed greater dependency on the KRAS gene than their wild-type counterparts across all tissue types. This dependency was evident not only in the tissue types conventionally associated with RAS genes, such as lung, colon, and pancreas, but also in numerous other tissue types examined.

We further extended our comparison to encompass all RAS mutant lines (including those with KRAS, NRAS, and HRAS mutations) versus WT lines, significantly broadening the tissue types included in the comparison. Interestingly, we consistently observed a negative shift in the distribution of KRAS CRISPR effect scores in RAS mutant lines compared to WT lines across many tissue types. However, we did observe instances in certain tissue types where the RAS mutant and WT lines exhibited little or no difference in the distribution of their KRAS scores (highlighted in red rectangles, Supplementary Fig. 2A). This observation was expected, particularly in tissue types like skin, where NRAS, not KRAS, is recognized as one of the main oncogenic players in cancers such as melanoma, based on observed mutation patterns in cancer patients [1].

Therefore, to better understand the role of RAS genes in these tissue types, we examined the CRISPR effect scores of the NRAS gene in RAS mutant versus WT lines within the same set of tissue types (Supplementary Fig. 2B).
As anticipated, we observed a clear negative shift in the distribution of CRISPR effect scores of the NRAS gene in RAS mutant lines compared to WT lines in tissue types that exhibited little or no difference in the KRAS scores (highlighted in red rectangles, Supplementary Fig. 2B). This finding indicated that in these tissue types, NRAS may compensate for the role of KRAS in oncogenesis, and NRAS, rather than KRAS, could play a major role in driving oncogenesis. These observations align with expectations based on the existing literature, which has reported the prevalence of RAS gene mutations in these tissue types [1,2,7,8,9].

To evaluate statistical differences, we employed native t-tests, a commonly used method. For the KRAS gene, our analysis revealed that the majority of tissue types exhibiting dissimilar score distributions (Supplementary Fig. 2A) also demonstrated significant t-test p-values (red arrows, Supplementary Fig. 2A). Conversely, certain tumor types appeared visually distinct but did not yield significant t-test p-values (green arrows, Supplementary Fig. 2A). In contrast, when examining the NRAS gene (Supplementary Fig. 2B), we observed that numerous tumor types did not achieve significant t-test p-values (green arrows, Supplementary Fig. 2B), despite apparent trends.

These observations underscore an important consideration: the data analyzed in our study were generated using high-throughput technology. With the advancements in this field and the availability of statistical analysis methods tailored for high-throughput data, it becomes imperative to leverage the entire dataset rather than focusing solely on individual genes, as we did in the analysis above. Employing a systematic approach that simultaneously analyzes all genes and data using statistically robust methods designed for high-throughput data is warranted.
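
The per-tissue comparison of mutant and WT score distributions can be illustrated with a short sketch. It assumes Welch's unequal-variance form of the t-test (the text does not say which variant was used), uses made-up scores rather than DepMap values, and the function name `welch_t` is hypothetical. It returns only the test statistic and the degrees of freedom; converting them to a p-value requires a t-distribution CDF, which is omitted here.

```python
from statistics import mean, variance

def welch_t(mutant, wt):
    """Return (t, df) for Welch's unequal-variance t-test of mutant vs. WT scores."""
    n1, n2 = len(mutant), len(wt)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 scores")
    v1, v2 = variance(mutant), variance(wt)   # sample variances (n - 1 denominator)
    se2 = v1 / n1 + v2 / n2                   # squared standard error of the mean difference
    t = (mean(mutant) - mean(wt)) / se2 ** 0.5
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Made-up effect scores for one gene in one tissue type (illustrative only)
mutant_scores = [-1.2, -0.9, -1.5, -1.1]
wt_scores = [-0.2, -0.4, -0.1, -0.3, -0.5]
t_stat, dof = welch_t(mutant_scores, wt_scores)
```

A strongly negative statistic means the mutant lines have more negative scores, i.e. greater dependency. Repeating such a test gene by gene and tissue by tissue is the individual-gene assessment whose limitations are discussed in the text.
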
By adopting a systematic, genome-wide approach, we not only reduce the risk of overlooking relevant findings but also enhance the statistical power of our analysis. As described later in this manuscript, our efforts in this direction led to intriguing insights.

Systematic analysis of high-throughput CRISPR effect data of DepMap revealed the prominent pervasive oncogenic role of RAS gene mutations and implication of tissue-specific permissiveness of mutant K- or N-RAS oncogenesis in a wide spectrum of tumor types

To perform a comprehensive analysis of the genome-wide (~ 17k genes) high-throughput CRISPR screening data from nearly 1000 cell lines, we opted to utilize the widely recognized linear model-based method called limma [20]. Limma, originally designed for high-throughput microarray data analysis, offers superior statistical power compared to the native t-test for high-throughput datasets by leveraging information from within-group replicates and borrowing information across genes [20]. This allows limma to identify more significant findings from the CRISPR effect scores of the CRISPR screening data obtained from the DepMap database.

Before proceeding with the limma analysis, we conducted an initial data assessment to ensure the suitability of applying the limma method to the genome-wide CRISPR gene effect score data from the DepMap database (Supplementary Fig. 3). Given the robustness of limma and the range of data distributions and types to which it is applicable, the assessment confirmed that the CRISPR effect scores from DepMap are appropriate for analysis using the limma method.

Utilizing the limma method, we conducted an initial investigation to examine the differences in CRISPR effect scores of genome-wide genes between KRAS mutants and WT lines. The results were presented visually using volcano plots, which effectively illustrate gene-level statistics (Supplementary Figs. 4 and 5).

As expected, KRAS emerged as the most significant gene, displaying a substantial difference at the level of the p-value adjusted for multiple testing (Supplementary Fig. 4). Additionally, KRAS exhibited the most pronounced negative logFC in comparisons between KRAS mutants and WT lines originating from lung and pancreas tissues (Supplementary Fig. 4). Consistent patterns were observed across other tumor types, with KRAS consistently identified as the most differential gene and the only gene exhibiting a significant adjusted p-value. The only exception was noted in the Haemato_and_lymph (haematopoietic_and_lymphoid) category, where KRAS was identified as nearly the most differential gene (Supplementary Fig. 5). Overall, the limma analysis consistently ranked KRAS as one of the top one or two genes across almost all tumor types with sufficient samples (Supplementary Table 1).

To evaluate how other mutated genes behave in comparison to RAS genes, we performed exhaustive computational screening of all qualified mutated genes, including commonly known oncogenic driver genes such as PIK3CA and BRAF, applying the same differential analysis used for the KRAS mutant vs. WT contrast. This screening, described in the next Results section, confirmed that the results of the KRAS mutant vs. WT contrasts were highly likely to be driven by the underlying RAS biology rather than by random chance.

Furthermore, we observed instances where native t-test p-values were not significant, but the adjusted p-values using the limma method yielded significance. This observation further highlights the advantages of employing high-throughput data analysis methods. Similar cases were also encountered during the comparison of RAS mutant and WT lines, as described in the subsequent analysis.
This underscores the robustness and reliability of the limma method in detecting meaningful differences and potential insights that might be overlooked by traditional statistical approaches.

After obtaining highly significant results in the KRAS mutant versus WT contrast, we proceeded to investigate whether these observations would also hold for the RAS mutant versus WT contrast. RAS mutants encompassed all KRAS, NRAS, and HRAS mutant lines. Interestingly, in a similar analysis comparing RAS mutants to WT lines, the KRAS gene once again emerged as the most significantly differential gene in lung, colon, and pancreas (panel A of Fig. 1), which is consistent with the well-established role of KRAS as an oncogenic driver in these tissue types. With lung as an example (top left in panel A of Fig. 1), we observed that the KRAS gene exhibited the most negative logFC and achieved the most significant adjusted p-value (< 0.05), suggesting that KRAS likely acts as an essential gene for RAS mutant lung cell lines, which depend on it in the sense of oncogene addiction. Additionally, two other genes, PTPN11 and GRB2, were identified as significantly differential genes (adjusted p-values < 0.05) with positive logFC in the volcano plot, indicating their potential essentiality for RAS WT lung cell lines (top left in Panel A, Fig. 1). Notably, KRAS and these two genes are well-known critical components of the RAS pathway, as annotated by the RAS Initiative [25] (Supplementary Fig. 6). These observations suggest that both RAS mutant and WT lines depend on RAS pathway genes as essential components.

Figure 1. KRAS or NRAS was derived as the most or nearly the most significantly differential gene for CRISPR gene effect at adjusted p-value < 0.05, with the most or nearly the most negative dependency difference between RAS mutant vs. WT lines, from corresponding subsets of tissue types.
Volcano plots of all genes for CRISPR effect score data of DepMap showed KRAS as the most significantly or nearly the most significantly differential gene between RAS mutant vs. WT lines from a subset of tissue types including lung, colon, breast, pancreas, ovary, biliary tract (A) or NRAS as the most significantly or nearly the most significantly differential gene between RAS mutant vs. WT lines from another subset of tissue types including Autonomic Ganglia, Liver, CNS, Skin, Soft Tissue, and Haemato_Lymph (abbreviation for HAEMATOPOIETIC_AND_LYMPHOID_TISSUE) (B). Green data points: genes with significant adjusted p-value (< 0.05) for multiple testing and logFC < 0; orange data points: genes with significant adjusted p-value (< 0.05) for multiple testing and logFC ≥ 0; red data points: genes with significant raw p-value (< 0.05); black data points: genes without statistical significance. The parentheses after “Mut” or “WT” indicate number of mutant lines or number of WT lines, respectively. Note: the limma model is set up on the whole dataset including all tumor types, so all data are fitted under the same limma model, which increased the power of the analysis as described earlier [20]. X-axis logFC: the actual difference in CRISPR effect scores between RAS mutant and WT lines; it is labelled as log2 fold change because the CRISPR effect scores, which are inherently on a logarithmic scale, were used directly in limma. Y-axis -log10(p.Value): (-1)*log10 of the raw p-value from the limma analysis. Colon: LARGE_INTESTINE.

However, what truly piqued our interest was the revelation that, beyond the main tissue types such as lung, colon and pancreas that RAS biology researchers commonly focus on, KRAS was also identified as the most or nearly the most significantly differential gene (adjusted p-values < 0.05) in numerous other tissue types (Panel A, Fig. 1).
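
The color legend of the Figure 1 volcano plots amounts to a simple decision rule on logFC, raw p-value and adjusted p-value, which can be sketched as follows. The sketch assumes a Benjamini-Hochberg adjustment; the text says only that p-values were adjusted for multiple testing, so this choice is an assumption. The function names, gene names and numbers are hypothetical and not taken from DepMap.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)  # largest p first
    adjusted = [0.0] * m
    running_min = 1.0
    for rank_from_top, i in enumerate(order):
        rank = m - rank_from_top                 # 1-based rank among ascending p-values
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min                # enforce monotonicity of adjusted values
    return adjusted

def volcano_color(log_fc, raw_p, adj_p, alpha=0.05):
    """Color category following the Figure 1 legend."""
    if adj_p < alpha:
        return "green" if log_fc < 0 else "orange"
    if raw_p < alpha:
        return "red"
    return "black"

# Hypothetical genes with made-up statistics (illustrative only)
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
log_fc = [-0.9, 0.3, -0.2, 0.05, -0.01]
raw_p = [0.0001, 0.004, 0.04, 0.6, 0.9]
adj_p = bh_adjust(raw_p)
colors = [volcano_color(fc, p, q) for fc, p, q in zip(log_fc, raw_p, adj_p)]
```

In this toy input, the third gene has a raw p-value below 0.05 but an adjusted p-value above it, so it falls into the red category rather than green, which mirrors the distinction the legend draws between raw and adjusted significance.
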
Intriguingly, for another distinct subset of tissue types, NRAS emerged as the most or nearly the most significantly differential gene (adjusted p-values < 0.05) (Panel B, Fig. 1).

Upon closer examination of the RAS mutation status in those tissues where NRAS was identified as the top differential gene, it became evident that most RAS mutant lines in these tissue types were, indeed, NRAS mutants. To corroborate these findings, we conducted differential gene analyses by directly comparing NRAS mutant versus wild-type (WT) lines across all examined tissue types, which yielded consistent results (see Supplementary Fig. 7). Taken together, these observations further support the prominent pervasive oncogenic role of RAS mutations.

Notably, in contrasts of NRAS mutant vs. WT lines, lung tissue also featured NRAS as the most differential gene (Supplementary Fig. 7), despite the contrasts of KRAS mutant versus WT and RAS mutant versus WT lines consistently highlighting KRAS as the most differential gene in lung tissue (Supplementary Fig. 4 and Fig. 1). We observed similar behavior in the haematopoietic_and_lymphoid tissue type: NRAS was the top differential gene in the contrasts of NRAS mutant versus WT (Supplementary Fig. 7) and RAS mutant versus WT lines (Fig. 1B), whereas KRAS was the top differential gene in the contrast of KRAS mutant vs. WT lines (Supplementary Fig. 5). Given that both lung and haematopoietic_and_lymphoid tissue types have sufficient numbers of cell lines with either KRAS or NRAS mutations, this suggested that the mutation rate of KRAS or NRAS depends on a tissue-specific preference.

Inspired by the aforementioned differential gene analysis and the remarkable prominence of KRAS or NRAS as the top differential genes in subsets of tissue types, we classified tissue types into KRAS- or NRAS-engaged categories according to whether KRAS or NRAS was derived as the top differential gene in the corresponding tissue type.
Subsequently, we conducted a direct assessment of the relationship between CRISPR effect scores for KRAS and NRAS within the context of KRAS- and NRAS-engaged tissue types and RAS mutation statuses. This analysis aimed to provide insights into the potential tissue-specific oncogenic capabilities of KRAS and NRAS mutations (Supplementary Fig. 8). A robust negative correlation was observed between the CRISPR effect scores of the KRAS gene and those of the NRAS gene, particularly in RAS mutant cell lines as compared with RAS wild-type (WT) cell lines, either within KRAS- and NRAS-engaged tissue types (top panel, Supplementary Fig. 8) or across all cell lines (data not shown; note: some tissue types in the database were not classified for RAS engagement because of limited numbers of RAS mutant lines). This behavior appeared to be attributable to the tissue type-specific deviation of the CRISPR effect scores of mutated KRAS or NRAS genes from those of the WT lines in the corresponding KRAS- or NRAS-engaged tissue types (top panel, Supplementary Fig. 8A). The majority of NRAS and KRAS mutant lines formed two distinct clusters separated from the main cluster of WT lines, consistent with their CRISPR effect scores for NRAS or KRAS tending to be more negative than those of WT lines (bottom panel, Supplementary Fig. 8A). This is consistent with the expectation that KRAS- or NRAS-engaged tissue types generally confer KRAS or NRAS dependency on RAS mutants, presumably by fostering tissue type-specific mutation rates of the KRAS or NRAS gene, respectively.

Notably, four “converted” lines shifted from the clusters of their original KRAS-engaged tissue types to those of NRAS-engaged tissue types (blue triangles indicated by blue arrows in the bottom panel, Supplementary Fig. 8A). This “conversion” coincided with their acquired NRAS mutations, which differ from the KRAS mutations that their original KRAS-engaged tissue types would be expected to foster (Supplementary Fig. 8B). Similarly, seven converted lines shifted from the clusters of their original NRAS-engaged tissue types to those of KRAS-engaged tissue types (red circles indicated by red arrows in the bottom panel, Supplementary Fig. 8A), coinciding with their acquired KRAS mutations, which differ from the NRAS mutations that their original NRAS-engaged tissue types would be expected to foster (Supplementary Fig. 8C). In addition, the deviation of some HRAS mutant lines from the WT lines coincided with HRAS mutation-driven negative CRISPR effect scores for the HRAS gene, suggesting that HRAS, once mutated, likely plays a role similar to that of KRAS or NRAS, despite the very limited number of HRAS mutant lines in the DepMap dataset (Supplementary Fig. 8D and 8E).

These observations revealed that although KRAS-engaged or NRAS-engaged tissue types would preferentially foster KRAS or NRAS mutations, respectively (Supplementary Fig. 8A), the bona fide acquired KRAS or NRAS mutations in the converted lines (Supplementary Fig. 8B and 8C) would override their original tissue type predisposition. These findings not only supported the prominent pervasive oncogenic power of RAS gene mutations from another perspective, but also offered insights into the tissue type-specific permissiveness of mutant K- or N-RAS oncogenesis across a diverse spectrum of tumor types.

In summary, the limma analysis consistently revealed that across multiple tissue types with sufficient samples (Table 1), either KRAS or NRAS emerged frequently as the top gene and uniformly as one of the top 4 genes, reinforcing their significance in the context of RAS mutant versus WT lines.
Even in tissue types where RAS genes were not ranked as the top differential genes, the top genes were still found to be RAS-related genes (Table 1). Furthermore, as highlighted in red in the summary table (Table 1), and as in the contrast of KRAS mutant versus WT lines described earlier, we observed many tissue types in which the native t-test p-value was not significant but the adjusted p-value from the limma method was. This finding suggests that the limma model can substantially improve statistical power compared to the commonly used cherry-picking analysis of individual genes within the DepMap database, underscoring the importance of systematically assessing high-throughput data. In fact, the limma results on high-throughput CRISPR screening data of DepMap not only confirmed all significant native t-test results but also recovered many cases that were not deemed significant by native t-tests (Fig. 2).

Table 1. Summary of limma results including top differential genes and top genes’ statistics between RAS mutant vs. WT lines from various tissue types from DepMap database.

Figure 2. Largely improved statistical power and benefit of the limma model compared to the commonly seen cherry-picking type of analysis of individual genes within the DepMap database. Comparison of CRISPR effect scores in boxplots of KRAS (left) or NRAS (right) between RAS mutant vs. wild type (WT) lines from various tissue types in DepMap. Only tumor types from DepMap with at least 3 samples in both RAS (K-, N-, H-) mutant and WT lines were used for comparison. This figure is similar to Supplementary Fig. 2, with additional red circles showing that the high-throughput limma results with significant multiple-testing-adjusted p-values not only confirm all native t-test results that are significant, but also recover many cases that were not significant by native t-test, as indicated by the green arrows.
In the tissue types indicated by the yellow circles (stomach and urinary tract), only the limma raw p-value, without multiple-testing adjustment, is significant. As shown in the volcano plots in Supplementary Fig. 9, although RAS genes are not called as significantly differential genes at the adjusted p-value level in these two tumor types, KRAS or NRAS is still derived as the top differential gene at the raw p-value level for stomach or urinary tract, respectively (red arrows in the plots). Red arrow: native t-test p-value is significant; green arrow: native t-test p-value is not significant. Note: the CRISPR effect score of each gene reflects the dependency of the cell line on that gene; the more negative the score, the more likely the cell line depends on the gene.

Interestingly, for the stomach and urinary tract tissue types, although limma-derived raw p-values without multiple testing were significant, limma did not classify RAS genes as significantly differential genes at the level of adjusted p-values within these tumor types. However, both KRAS and NRAS were still identified as the top differential genes at the raw p-value level (top, Supplementary Fig. 9). Additionally, in the urinary tract dataset, the contrast of NRAS mutant versus WT lines did yield statistical significance for NRAS after multiple testing, with an adjusted p-value < 0.05 (bottom-right, Supplementary Fig. 9). A similar observation was made in the stomach dataset for the contrast of KRAS mutant versus WT lines (bottom-left, Supplementary Fig. 9).
Collectively, this indicates the putative prominent pervasive role of RAS mutations in the oncogenesis of a very wide spectrum of tumor types.

To gain a deeper understanding of the differential gene analysis results across all examined tissue types, we conducted a thorough examination of the significantly differential genes based on CRISPR effect scores between RAS mutant versus WT lines. We classified these differential genes as potentially essential genes for RAS mutant lines (2nd column of Supplementary Table 2) or as essential genes for WT lines (3rd column of Supplementary Table 2), depending on whether they exhibited more negative scores in RAS mutant lines or in WT lines, respectively. As anticipated, many of these genes are well-known oncogenic genes from the RAS pathway, as annotated by the RAS Initiative (Supplementary Table 2 and Supplementary Fig. 6). In addition to KRAS and NRAS, several other genes, such as RAF1 and SHOC2, were identified as essential for RAS mutant lines in multiple tissue types, particularly in lymph/blood and skin tissues. Conversely, other genes like BRAF, SOS1, MAPK1, and GRB2 were essential for WT lines, with PTPN11 (i.e., SHP2), as previously described, essential for WT lines in both lung and lymph/blood tissues (Supplementary Table 2 and Supplementary Fig. 6).

Exhaustive computational screening demonstrated the observation of KRAS, NRAS, or RAS gene as the top gene in differential analysis not possibly occurring by random chances

Our differential gene analysis of CRISPR effect data in contrasts of KRAS mutant vs. WT lines consistently identified the KRAS gene as the top differential gene across numerous tissue types, extending beyond the three main tissue types (i.e., lung, colon, pancreas) traditionally associated with KRAS-driven oncogenesis.
To assess whether such results could arise by random chance for any other qualified mutated gene (one with sufficient cell lines harboring its mutations) in each tissue type in which we analyzed KRAS, we conducted exhaustive computational screening of all qualified mutated genes, following the procedure described in the methods section and outlined in Supplementary Fig. 10A. The statistical results of our screening are summarized in Supplementary Fig. 10. The main finding was that the observed significance of KRAS cannot be attributed to random chance with respect to any other existing mutated gene in the tissue types in which we analyzed KRAS (Supplementary Fig. 10B). This strongly suggests that the observed results are most likely driven by the underlying RAS biology.

Importantly, through these exhaustive screenings, we identified several interesting mutated genes (column Top2Genes in the table of Supplementary Fig. 10B) that exhibited similar behavior to KRAS. When lines mutant for each of these genes were compared with the corresponding WT lines, the limma method used in the computational screening also ranked the corresponding gene as the top differential gene. Interestingly, many of these genes are already well known in the field, such as BRAF, PIK3CA, TP53, and CTNNB1, which are established oncogenic driver genes or tumor suppressor genes, as identified by recent computational studies on oncogenic driver genes [7,8,9]. Some of these genes, such as BRAF, PIK3CA, and CTNNB1, appeared across multiple tissue types (column Top2Genes in the table of Supplementary Fig. 10B). These findings suggest that, amongst the top differential genes identified in these exhaustive screenings, even unfamiliar genes that behave similarly to KRAS and other well-known oncogenes may be novel oncogenic driver genes whose biological relevance was previously unknown.
However, it is important to emphasize that in terms of their prevalence across tissue types, these mutated genes do not appear to be nearly as prominent and pervasive as the KRAS gene, which is consistently ranked as the top differential gene in not just one or two, but nearly all tissue types. This distinction sets the KRAS gene apart from the other top genes identified in the exhaustive computational screening.

Similarly, to test whether the emergence of NRAS as the top gene in the differential analysis could arise by random chance, we performed the same exhaustive computational screening of all qualified mutated genes in each tissue type in which we analyzed NRAS. The statistical results of this screening (Supplementary Fig. 10C) point to the same conclusion: the observed significance of NRAS cannot be attributed to random chance for any other existing mutated gene in these tissue types, which again suggests that our observed results are most likely driven by the underlying RAS biology.

Like the results obtained from the KRAS or NRAS mutant vs. WT contrasts, the findings from the RAS mutant vs. WT contrast were even more pronounced, with RAS genes (i.e., KRAS or NRAS) consistently emerging as the top differential genes across a broader range of tissue types. To evaluate the possibility of achieving these results by random chance for any combination of three qualified genes with sufficient numbers of mutant cell lines, we conducted a similar computational screening procedure. In this procedure, we performed 10,000 trials of randomly selected combinations of three qualified mutated genes (ensuring sufficient sample sizes in each group of the contrast) for each tissue type, resembling the three mutated RAS genes (K-, N-, H-) used in the RAS mutant vs. WT contrasts.
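
The sampling step of the random three-gene screening can be sketched as below. This is only an illustration under stated assumptions: the function names, the toy mutation map and the rule that a line counts as mutant if it carries a mutation in any gene of the trio are hypothetical, and the qualification criteria and the differential analysis run on each split are defined in the methods section and not reproduced here.

```python
import random
from math import comb

def draw_unique_trios(genes, n_trials, seed=0):
    """Draw n_trials distinct unordered combinations of three genes."""
    genes = sorted(set(genes))
    if n_trials > comb(len(genes), 3):
        raise ValueError("not enough genes for that many unique trios")
    rng = random.Random(seed)
    trios = set()
    while len(trios) < n_trials:              # rejection sampling until enough unique trios
        trios.add(tuple(sorted(rng.sample(genes, 3))))
    return sorted(trios)

def split_lines(trio, mutations, all_lines):
    """Mutant group: lines mutated in any gene of the trio; WT group: the rest."""
    mutant = set()
    for gene in trio:
        mutant |= mutations.get(gene, set())
    return sorted(mutant), sorted(set(all_lines) - mutant)

# Hypothetical toy inputs: gene -> set of cell lines carrying a mutation in it
toy_mutations = {
    "GENE_A": {"line1", "line2"},
    "GENE_B": {"line2", "line3"},
    "GENE_C": {"line4"},
    "GENE_D": {"line5"},
    "GENE_E": {"line1", "line6"},
}
toy_lines = [f"line{i}" for i in range(1, 9)]
trios = draw_unique_trios(toy_mutations, n_trials=5, seed=1)
groups = [split_lines(trio, toy_mutations, toy_lines) for trio in trios]
```

Each mutant/WT split would then go through the same differential analysis as the RAS mutant vs. WT contrast, and the trial would record whether any gene of the trio ranks among the top differential genes.
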
The obtained statistics, summarized in Supplementary Tables 3 and 4 for two independent sets of unique trials for each tissue type, consistently indicated that the observed results for RAS mutants could not have occurred by random chance for any combination of three mutated genes within the DepMap database. This further supports the notion that the findings obtained from RAS mutant vs. WT contrasts are highly likely to be driven by the underlying RAS biology.

Furthermore, we also identified several mutated genes (column Top4Genes in Supplementary Tables 3 and 4) that behaved like RAS genes in one to three tissue types. Among them were several well-known oncogenic driver genes, such as ALK, BRAF, PIK3CA and PIK3R1 (components of PI3 kinase), and CTNNB1, in line with recent computational studies on oncogenic driver genes [7,8,9]. Additionally, certain genes, including BRAF and PIK3CA, emerged in both sets of unique trials, not only in a single tissue type but across multiple tissue types, providing further support for their potential oncogenic roles. Similarly, less commonly known genes like WRN were also identified in both independent sets of unique trials (Supplementary Tables 3 and 4), consistently appearing in the same tissue types (i.e., colon, ovary, and stomach), with multiple occurrences in most cases, except for the first set of unique trials for the stomach (Supplementary Table 3). These findings collectively underscore the potential significance of these identified genes as novel oncogenic drivers that deserve further investigation.

To evaluate the biological significance of the differential genes obtained from the unique trials of the computational screenings, we assessed the enrichment levels of oncogenic driver genes within these genes.
Specifically, we examined the enrichment of well-annotated driver genes, either defined by tumor type (green rows in Supplementary Table 5) or not defined by tumor type (green rows in Supplementary Table 6), in the differential gene lists derived from the unique trials (n = 10,000) of the computational screenings.

Remarkably, we observed significant enrichment of these well-annotated driver genes in most tissue types that had at least two differential genes identified from the unique trials. This finding is particularly encouraging, considering that the annotated driver genes were primarily derived from pioneering computational studies in the field [7,8,9]. Furthermore, we noted the presence of notable driver genes even in tissue types with only two differential genes (Supplementary Tables 5 and 6), further emphasizing the biological relevance and potential significance of these genes. Overall, the enrichment analysis supports the notion that the differential genes identified from the unique trials of the computational screenings are biologically meaningful and potentially represent novel oncogenic drivers.

However, as described above, although many other oncogenic driver genes were revealed by these computational screening analyses, no single mutated gene demonstrated an impact of the same strength and breadth across multiple tissues as KRAS and NRAS. Once mutated, none of them behaved nearly as mutant KRAS or NRAS did, either within each tissue type or consistently across tissue types. Only KRAS or NRAS was revealed as the top differential gene consistently across multiple tissue types with sufficient numbers of samples, whereas any other mutated gene did so in one or two tissue types at best, as evident from the exhaustive computational screening results.
These observations provide substantial support, from another perspective, for the prominent pervasive oncogenic role of RAS gene mutations.

Enrichment and association analysis demonstrated oncogenic driver genes including RAS genes as the top genes out of genome-wide genes with the most significant association between the presence of mutations in a gene and dependency on this corresponding gene

According to the DepMap database portal (https://depmap.org/portal/), a CRISPR effect score of 0 indicates a non-essential gene, while a score of -1 corresponds to the median of all common essential genes, after data normalization and standardization. We conducted a comprehensive evaluation of over 17 thousand genes in the effect score dataset, assessing for each gene the association between the presence of mutations in that gene and the dependency on it, as measured by CRISPR effect scores, across nearly 1000 cell lines from the DepMap database. The enrichment level of each gene was assessed using Fisher’s exact test on a typical 2 × 2 contingency table, as described in more detail in the Methods section. Notably, KRAS and NRAS ranked first and second, respectively, alongside other well-known oncogenic driver genes, in the list of genes exhibiting the most significant association. These findings were corroborated by the results of the Wilcoxon rank sum test and t-test (Supplementary Table 7).

Remarkably, many of the top genes with significant Enrichment.Adjusted.p.Val were annotated as oncogenic driver genes in the literature (Supplementary Table 7). Although many oncogenic driver genes in the top list did display significant enrichment when genes were assessed individually, strikingly only a handful of them remained significant after multiple-testing correction. 
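As a rough illustration of the per-gene enrichment test described above, the following minimal sketch builds a 2 × 2 table (mutant vs. WT lines against dependent vs. non-dependent lines) and computes a one-sided Fisher's exact p-value from the hypergeometric distribution. The dependency cutoff of -0.5, the function names, and the toy scores are assumptions made for illustration only; the exact table construction used in this study is the one given in the Methods section.

```python
from math import comb
from typing import Dict, Set, Tuple


def association_table(scores: Dict[str, float], mutants: Set[str],
                      threshold: float = -0.5) -> Tuple[int, int, int, int]:
    """Count cell lines into a 2 x 2 table: (mutant, WT) x (dependent, not dependent)."""
    a = b = c = d = 0
    for line, score in scores.items():
        dependent = score < threshold  # assumed cutoff, not taken from the paper
        if line in mutants:
            if dependent:
                a += 1  # mutant and dependent
            else:
                b += 1  # mutant, not dependent
        elif dependent:
            c += 1      # WT and dependent
        else:
            d += 1      # WT, not dependent
    return a, b, c, d


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value for enrichment of dependency among mutants."""
    n_mutant = a + b
    n_dependent = a + c
    n_total = a + b + c + d
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    upper = min(n_mutant, n_dependent)
    favorable = sum(
        comb(n_mutant, k) * comb(n_total - n_mutant, n_dependent - k)
        for k in range(a, upper + 1)
    )
    return favorable / comb(n_total, n_dependent)


# Hypothetical CRISPR effect scores for one gene in eight made-up cell lines.
scores = {"L1": -1.2, "L2": -0.9, "L3": -0.8, "L4": -0.1,
          "L5": -0.2, "L6": 0.0, "L7": -0.7, "L8": 0.1}
mutants = {"L1", "L2", "L3", "L4"}  # lines carrying a mutation in that gene

table = association_table(scores, mutants)
p_value = fisher_exact_greater(*table)
print(table, p_value)
```

In a genome-wide setting, a test of this kind would be repeated for every gene and the resulting p-values adjusted for multiple testing, which is what a column such as Enrichment.Adjusted.p.Val reports.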
The significance levels of Enrichment.Adjusted.p.Val for KRAS and NRAS also clearly dominated those of any other oncogenic driver gene in the top list, including BRAF and PIK3CA. Interestingly, the vast majority of the significant oncogenic driver genes in the top list belong to the RAS signaling pathway (the exception being CTNNB1). This suggests that RAS signaling may be a special case, with RAS genes as the top genes having a prominent and pervasive effect on reprogramming cells to become dependent on the oncogenic changes. Moreover, these observations are highly consistent with, if they do not directly support, the unique and exceptional behavior of RAS (K- or N-RAS) genes relative to other oncogenic driver genes, namely the finding that RAS genes were the only genes with prominent and pervasive oncogenic roles in the exhaustive computational screening analysis (Supplementary Fig. 10, Supplementary Tables 3 and 4).

These results suggest that, when all tissue types are considered together, KRAS exhibits the most significant association between the presence of its mutations and dependency on it. We next sought to verify whether this observation holds true for each specific tissue type. As expected, the results for colon and lung consistently identified KRAS as the top gene with the most significant enrichment and association (top and middle tables in Supplementary Fig. 11). Although pancreas did not yield a significant enrichment result, likely due to the limited sample size of RAS WT lines, both the Wilcoxon test and the t-test revealed a significant association with KRAS (circled in red, bottom table in Supplementary Fig. 11). Additionally, the more powerful Barnard test, an alternative to Fisher’s exact test, yielded a significant p-value of 0.03 for KRAS in pancreas (data not shown).

Interestingly, we also observed significant enrichment and associations for NRAS or KRAS within tumor types such as blood, lymphoid, and ovary (Supplementary Fig. 12). 
In other tissue types, although not significant after multiple-testing correction, KRAS and PIK3CA consistently appeared as the top genes with the most significant enrichment and association at the raw p-value level (Supplementary Fig. 12). Similarly, in several other tissue types, NRAS consistently emerged as the top gene with the most significant association, either after multiple-testing correction (green rows) or at the raw p-value level (Supplementary Fig. 13). These findings not only support the prominent pervasive oncogenic role of RAS mutations but are also in line with the concept of KRAS- or NRAS-engaged tissue types derived from the differential gene analysis of CRISPR data described earlier.

Other genomic data supports the findings from differential gene analysis of CRISPR screening data

We identified numerous oncogenic driver genes from the analysis of the DepMap CRISPR screening data. To reinforce these results, we searched extensively for supporting evidence in other DepMap genomic data, including the mutation status of the cell lines. Among the identified essential genes, we focused on BRAF as a proof of concept. BRAF was identified as the top differential gene for CRISPR effect scores between RAS mutant and WT lines, and as an essential gene for WT lines of skin origin (Supplementary Table 2).

A search for genes mutated in a mutually exclusive manner with RAS genes in skin cell lines revealed BRAF as the top gene exhibiting a mutually exclusive mutation pattern with RAS genes (Fig. 3). As BRAF, like the RAS genes, is a well-known oncogenic driver gene for skin cancer (e.g., SKCM), these findings from the mutation data strongly support our observation that BRAF is one of the essential genes in RAS WT skin cell lines, as identified through differential gene analysis of CRISPR effect scores. 
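The mutual-exclusivity assessment can be sketched as a lower-tail Fisher's exact test: given how many cell lines carry a mutation in a candidate gene and how many carry a RAS mutation, it asks how likely it is to see as few co-mutated lines as observed if the two events were independent. The function name, the cell line labels, and the toy mutation sets below are hypothetical and serve only to illustrate the idea; they are not data from this study.

```python
from math import comb
from typing import Set


def mutual_exclusivity_p(all_lines: Set[str], gene_mutants: Set[str],
                         ras_mutants: Set[str]) -> float:
    """Lower-tail Fisher's exact p-value; small values suggest mutual exclusivity."""
    gene = gene_mutants & all_lines
    ras = ras_mutants & all_lines
    n, n_gene, n_ras = len(all_lines), len(gene), len(ras)
    both = len(gene & ras)  # observed number of co-mutated lines
    # Smallest overlap that is possible given the two marginal counts.
    lowest = max(0, n_gene + n_ras - n)
    # Hypergeometric probability of an overlap as small as, or smaller than, observed.
    favorable = sum(
        comb(n_gene, k) * comb(n - n_gene, n_ras - k)
        for k in range(lowest, both + 1)
    )
    return favorable / comb(n, n_ras)


# Hypothetical example: ten cell lines, no line mutated in both genes.
lines = {f"C{i}" for i in range(1, 11)}
gene_x_mutants = {"C1", "C2", "C3", "C4", "C5"}
ras_mutants = {"C6", "C7", "C8", "C9", "C10"}

p_exclusive = mutual_exclusivity_p(lines, gene_x_mutants, ras_mutants)
print(p_exclusive)
```

Ranking candidate genes by this p-value within one tissue type would produce a top list of the kind shown in the left panel of Fig. 3, under the assumptions stated here.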
Similarly, in colon cell lines, the search for genes mutated in a mutually exclusive manner with RAS genes also highlighted BRAF as the top gene exhibiting a mutually exclusive mutation pattern with RAS genes (Supplementary Fig. 14), consistent with earlier reports26,27. Although the differential gene analysis of CRISPR effect score data did not detect BRAF as significant at the adjusted p-value level in colon, the limma analysis detected it with a raw p-value of 0.0047 (Supplementary Fig. 14, Supplementary Table 2, detailed result for BRAF in colon not shown). Together, these lines of evidence reinforce the notion that RAS WT lines also rely on the RAS pathway or RAS-related oncogenic genes such as BRAF.

Figure 3

BRAF was revealed as an essential gene for RAS WT lines with the most significant mutually exclusive mutations with RAS genes in skin cell lines. Left panel: top list of genes with mutations mutually exclusive with RAS genes in skin cell lines. Right panel: heatmap of mutation status for the top list of genes with mutations mutually exclusive with RAS genes in skin cell lines. Mutual exclusivity was assessed by Fisher’s exact test on a 2 × 2 contingency table created for all cell lines (whether or not a cell line carries a mutation in the corresponding gene versus whether or not it carries RAS mutation(s)).

In addition to CRISPR screening data, DepMap provides gene expression (RNAseq) data for the cell lines, and we obtained proteome data for these CCLE cell lines from outside the DepMap database. We examined both the RNAseq and proteome data at a high level using dimensionality reduction techniques, including MDS and PCA, which capture the main trends in the expression data (Supplementary Fig. 15). 
Both RNAseq and proteome data revealed an overall difference in expression profiles between cell lines from KRAS-engaged tissue types (large red circle) and those from NRAS-engaged tissue types (large blue circle) (left panels of A and B, Supplementary Fig. 15), which supports the concept of KRAS- or NRAS-engaged tissue types inferred from the differential analysis of DepMap CRISPR effect data.

However, both RNAseq and proteome data suggest that the impact of RAS mutations on the difference between (K- or N-) RAS mutant and WT lines may be weaker than that of tissue-specific expression profiles, or that such impact may be conferred through other mechanisms, such as signaling or post-transcriptional modifications, that could be explored using other omics data (Supplementary Fig. 15). In addition, even among lines of the same tissue origin, the transcriptional profiles vary widely for the converted lines (labeled in the plots and described in Supplementary Fig. 8B, 8C), which have acquired KRAS or NRAS mutations different from those their original RAS-engaged tissue types would foster. These observations suggest that other mechanisms, such as signaling at the protein level rather than transcriptional changes, are also likely to be involved. Unfortunately, post-transcriptional modification data for these CCLE cell lines are not available from the DepMap database to address this possibility.

Having observed a large difference in transcriptional profiles between tissue types, we extracted the differentially expressed genes between RAS mutant and WT lines within each individual tissue type that had matched samples with CRISPR screening data. Subsequently, we used an in-house pathway pattern extraction pipeline (PPEP) to assess pathway enrichment in these differential gene lists across multiple tissue types (Fig. 4). 
As expected, our analysis revealed widespread enrichment of the KRAS signaling and PI3K signaling pathways across many tissue types (Fig. 4). This strongly suggests that the expression profiles of these RAS mutant cell lines may have been rewired, likely triggered by the RAS mutations, and adapted to promote the fitness of RAS mutants, thereby potentially conferring the oncogenic benefits associated with RAS gene mutations.

Figure 4

Hallmark KRAS signaling and PI3K signaling pathways were preferentially enriched in differentially expressed genes between RAS mutant vs. WT lines across multiple tissue types derived from RNAseq data of DepMap database. After deriving the differentially expressed genes (DEGs) between RAS mutant and WT lines in various tumor types that have matched samples with CRISPR screening data, using two RNAseq analysis methods (edgeR and DESeq2), the in-house PPEP23 analysis was performed to assess how pathways are enriched in these differential gene lists across multiple tissue types. KRAS signaling and PI3K signaling pathways from various sources of pathway/gene set annotations in the MSigDB database that were widely enriched across the differential gene lists are indicated by green and pink arrows, respectively. Types: tissue types. Methods used for deriving DEGs: e, edgeR; d, DESeq2. Annotations of RAS_Engaged_TissueTypes are based on whether the top differential gene was KRAS, NRAS, or a gene of the RAS pathway in the contrasts of KRAS, NRAS, and RAS mutant vs. WT lines in each respective tissue type in the differential analysis of CRISPR effect score data.

Robustness of the findings was supported by a more recent DepMap dataset and data mining of computational studies on cancer driver genes

An important factor to consider is that the DepMap database is updated periodically, on a quarterly basis. 
Given the significant time interval between our initial analysis and the submission of this manuscript, we deliberately used the most current version available at that time, version 23Q2, to confirm our results derived from the older version 21Q1 of the DepMap data. Using this updated dataset, we conducted a comprehensive limma analysis to identify differential genes for CRISPR effect score data between RAS mutant and WT lines (Supplementary Figs. 16 and 17). Notably, the findings were highly consistent with our earlier results, further supporting the robustness of our observations.

Numerous studies have explored cancer driver genes using mutation data from diverse tumor types7,8,9. Although these studies address oncogenic driver genes in general rather than RAS genes specifically, detailed data mining of the driver genes they reported revealed that RAS genes were computationally predicted to be oncogenic driver genes across many tumor types (Supplementary Fig. 18). However, these studies did not reveal RAS genes as prominent or pervasive oncogenic driver genes distinct from other oncogenic genes (data not shown), owing to the nature of the studies, their data sources, and differences in methodology, whereas our study consistently showed, through multiple lines of evidence, the prominent pervasive behavior of RAS genes in a diverse range of tissue types. In addition, our study goes further by providing more precisely delineated insights into the oncogenic roles of RAS genes, specifically KRAS or NRAS, as the prominent oncogenic drivers within specific subsets of tissue types. This highlights the tissue-specific permissiveness and preference of mutant K- or N-RAS oncogenesis. These studies7,8,9 were likely limited by relying only on mutation data and protein structure information to make inferences. 
In contrast, our study distinguishes itself as the first to derive insights from the analysis of genomic data together with high-throughput gene dependency data from DepMap, which are more relevant in the context of this study. This distinction adds substantial confidence in terms of the data types and resources used and differentiates our work from these primarily mutation data-based computational studies7,8,9.
