Scalable, compressed phenotypic screening using pooled perturbations

Scaling up biochemical phenotypic screens with compressionTo increase the throughput of phenotypic screens and unlock the use of high-throughput assays and model systems of increasing complexity (Fig. 1a), we examined the feasibility of compression through pooling external perturbations, such as biological ligands or chemical compounds (Fig. 1b). In this approach, we combine N perturbations into unique pools of size P, ensuring each perturbation appears in R distinct pools overall (Fig. 1b,c). Relative to a conventional screen where replicates of each perturbation are screened individually, a CS reduces the sample number, cost and labor requirements by a factor of P, which we refer to as P-fold compression. To analyze the output of a CS, we developed an assay-independent computational framework for deconvoluting the effects of individual perturbations on the basis of regularized linear regression and permutation testing (Fig. 1d), inspired by previous work inferring the effects of guide RNAs on genes in pooled CRISPR screens6,14.Developing and benchmarking CS with high-content imagingWe conducted a series of GT experiments in which we examined the impact of a 316-compound Food and Drug Administration (FDA) drug repurposing library (Supplementary Table 1) on the morphology of U2OS cells to establish a reference dataset for benchmarking the feasibility of compression (Fig. 2a). We chose this library because it would represent a challenging use case for pooling and inferring the impact of individual perturbations (because bioactive compounds with large effects would frequently co-occur in our pools); we selected this model, meanwhile, so that we could readily establish GT (as we could generate enough input material to test each perturbation independently, as well as multiple compressions). Given the size of the GT screen, we opted to measure cellular phenotypes with Cell Painting15, a cost-effective high-content morphological profiling readout that multiplexes six fluorescent dyes (probes: nuclei, Hoechst 33342; endoplasmic reticulum, concanavalin A–AlexaFluor 488; mitochondria, MitoTracker Deep Red; F-actin, phalloidin–AlexaFluor 568; Golgi apparatus and plasma membranes, wheat germ agglutinin–AlexaFluor 594; nucleoli and cytoplasmic RNA, SYTO14), imaged in five channels, to examine multiple cellular components and organelles (Fig. 2b). We constructed and applied a pipeline to analyze our Cell Painting datasets, inclusive of illumination correction, quality control, cell segmentation, morphological feature extraction, plate normalization and highly variable feature selection, yielding 886 informative morphological attributes for further analysis16 (Supplementary Table 2 and Methods).Fig. 2: Compressed screening identifies compounds with largest effects in a GT setting.a, Overview of CS benchmarking experiments used to assess the morphological impacts of 316 FDA-approved compounds on U2OS cells. b, Overview of conventional GT screen designed to address a (Methods). c, The effects of the 316 compounds were calculated relative to DMSO controls using the MD and the drug × feature matrix was clustered to identify GT drug-associated phenotypes. d, One-sided Fisher’s exact enrichments (−log10(P value)) of the features differentially enriched in each GT phenotype (log2 fold change > 3) from the seven classes of Cell Painting features (five cellular components plus area or shape and neighborhood features). Each of the seven is further broken down on the basis of whether the observation pertains to the entire cell, the cytoplasm only or the nucleus only. The righthand bar visualizes the mean number of cells per well across all samples in each GT phenotype. e, Heat map (−log10(P value), one-sided permutation test with no correction) showing the top five drugs associated with each of the eight GT clusters. f, Overview of our compressed screening benchmarking experiment, which tested a range of compressions (\({S}_{{\rm{compressed}}}=\frac{N\times R}{P}\)) for the same 316 drugs (N) by examining several pool sizes (P) and replications (R) to identify the capabilities and limits of the approach. g, CS experimental overview. h, Analytical approach for inferring the effects of each perturbation using regularized linear regression and solving for the coefficient matrix (β). i, Inferred perturbation effects in a CS (scaled L1 norm, y axis) versus those from the GT screen (MD, x axis) for two replicate runs (r, Pearson correlation; two-sided correlation test CS run 1, P value < 2.2 × 10−16; CS run 2, P value < 2.2 × 10−16). j, Correlation of the effect sizes between GT and CS runs across all pool sizes for the perturbations that were significantly associated with any of the phenotypic clusters in the GT screen (one-sided permutation test without multiple hypothesis correction, P value < 0.01). k, Receiver operating characteristic (ROC) curves (false positive rate versus true positive rate) calculated to show the performance of the CS screens in correctly identifying GT hits for each pool size while varying the deconvolution permutation testing threshold (from 0 to 1 in steps of 0.01). l, Mean scaled L1 norm of the perturbations called as hits (scaled L1 norm > 0) in both replicate CSs at each pool size (y axis). The drugs plotted are those that resulted in a statistically significant effect in any pool size (one-sided permutation test without multiple hypothesis correction, P value < 0.01). m, False negative rate, calculated as perturbations with significant MDs in the GT screen but unrecovered in CS among all significant GT perturbations, as a function of pool size in the CS.To find optimal conditions for our screen, we conventionally assayed nine concentration and time point combinations, evaluating the effects of different treatments, durations, doses and batches on cellular morphology (Extended Data Fig. 1a,b) and viability (Extended Data Fig. 1c,d and Methods). To quantify an overall morphological effect, we computed a vector of median values for each of the 886 morphologically informative features in each sample well and then calculated the average Mahalanobis Distance (MD) between the control and perturbation vectors for each compound (Fig. 2c). The MD is a multidimensional generalization of the z-score2,3,17,18 that has been used extensively to discern effect size in studies deploying high-content cellular morphological assays19,20. On the basis of these pilots, we selected the 24-h time point and 1 μM concentration combination to generate the full GT dataset because these conditions yielded the largest MD coefficient of variation (Extended Data Fig. 1a,b). We next screened all 316 compounds individually with six-fold replication, resulting in 2,088 samples (that is, wells; inclusive of 1,896 perturbation and 192 control (DMSO) samples) to define our GT (Fig. 2b). To identify shared phenotypic responses among these data, we performed dimensionality reduction over the aforementioned 886 morphologically informative features for all wells with a sufficient number of cells (≥50), yielding eight GT phenotypic clusters enriched for distinct morphological properties related to the intensity and distribution of each stain, as well as overall cell shape and structure (Fig. 2c,d, Extended Data Fig. 1e–g and Methods). One of the clusters (cluster 1) was dominated by DMSO control samples, whereas the others (clusters 2–8) were enriched for specific drug perturbations (Fig. 2e). Reassuringly, drugs driving defined morphological responses, albeit with different effect sizes (as determined by MD), shared common MOAs (for example, antineoplastics in cluster 7; Extended Data Fig. 1h,i and Supplementary Table 1).We next performed a series of matched CSs. Using the same U2OS model, Cell Painting readout and 316-drug library, we thoroughly tested the feasibility and limits of our approach by spanning different degrees of compression (3–80 drugs per pool) and replication (each drug appearing in a total of three, five or seven pools; Fig. 2f,g). After deconvolving the impact of each drug on the 886 morphological features using regularized linear regression and performing quality controls for toxicity (Fig. 2h and Extended Data Fig. 2a–c), we directly compared the CS and GT results. In the CS, the MD between a perturbation and the control cannot be computed directly because the deconvolution results in a point estimate for the effect of each perturbation (coefficient weights), including the controls, rather than a distribution. Therefore, after comparing several metrics, we chose to summarize the effect of each perturbation in the CS using the max-scaled L1 norm of all significant regression coefficients (permutation test; Extended Data Fig. 2d and Methods). We observed a strong correlation between the perturbation effects computed for the GT and CS screens for pool sizes of up to 40 perturbations, with good agreement for results derived from screens with the same pool size but distinct combinations of drugs in each pool (Fig. 2i,j and Extended Data Fig. 2e). By varying the significance threshold for our permutation tests, we found that we could tune the deconvolution to identify either more hits with a higher false positive rate or fewer hits that were more likely to be true positives (Fig. 2k and Extended Data Fig. 2f). Across all pool sizes, we consistently detected significance in the CS (scaled L1 norm > 0) for those compounds with the largest GT effects, while those with a more modest impact were only captured at lower compressions (Fig. 2l,m). Together, our benchmarking results demonstrate that CS is efficient, accurate and reproducible, and that the degree of compression can be tuned by a user to balance throughput with stringency.Examining TME ligands impact on PDAC organoids with CSWhile TME signals are critical features dictating patient survival across myriad cancers21, ex vivo systems typically overlook these microenvironmental features. This can result in a loss of therapeutically relevant transcriptional features in vitro, leading to decreased concordance with in vivo biology12,22. Understanding how multiple TME factors influence response signatures requires systematic perturbation experiments in patient-derived models that can scale with high-dimensional readouts such as scRNA-seq; however, cost, labor and biological material input requirements have limited these measurements using conventional approaches. To overcome this barrier and evaluate the utility of our approach to enable biological discovery, we used CSs to characterize transcriptional responses to 68 TME ligands in early-passage, patient-derived PDAC organoids, a model system known to drift transcriptionally in culture12,23 (Fig. 3a).Fig. 3: CS of biological ligands on PDAC organoids reveals major axes of transcriptional response that are recapitulated in single-ligand validation screens with clinical relevance.a, Overview of a CS designed to explore the effects of macrophage-derived ligands on patient-derived PDAC organoid transcriptional states. b, Overview of experimental setup for the 68 biological ligand CS (with select single-ligand landmark perturbations) on PDAC organoids with a scRNA-seq readout. c, Overview of computational analyses on scRNA-seq results from the CS used to identify GEPs and the ligands that induce them. d, Scatter plot of significant ligand effects on cNMF modules (deconvolution regression coefficients) from two CSs with distinct random pooling. e, Heat map of the mean significant ligand effects on cNMF modules over both CS runs. f, Top, overview of a single-ligand validation experiment. The top 11 hits (adjusted P value < 0.05) that emerged in both CSs were validated by performing single-ligand treatments in duplicate. Hits are grouped by the GEPs they induce, and landmark perturbations are noted with an asterisk. DMSO-treated wells served as negative controls. Bottom, Heat map visualizing the Pearson correlation between the effects of each ligand on the CS cNMF GEPs in the CS (rows) and the effects of the same ligands on the CS cNMF GEPs in the single-ligand experiment (columns). g, Top, Venn diagrams depicting the number of shared and unique genes between the cNMF PDAC IL-4 and IL-13 response GEP and corresponding signatures in MsigDB. Bottom, Kaplan–Meier survival plots of TCGA PAAD cohort (n = 182, bulk RNA-seq expression data) based on (left to right) our PDAC IL-4 and IL-13 response module or three gene signatures (Reactome, Biocarta and Lu et al. IL-4) from MsigDB.More specifically, we conducted two PDAC organoid CSs with the same model, ligand library and compression scheme (replicates = 5, mean pool size = 4.75), but distinct ligand poolings (Fig. 3b). In addition to negative controls (wells containing only organoids and media), we individually screened three ligands (tumor necrosis factor (TNF), transforming growth factor-β1 (TGFβ1) and interferon-γ (IFNγ)) with known activity in PDAC12,24 as positive controls (landmarks). After sample demultiplexing and quality control, we analyzed 5,662 cells and 10,881 cells from the two runs, with a mean ± s.d. of cells per compressed pool of 59 ± 32 and 113 ± 65, respectively (Extended Data Fig. 3a,b).As PDAC cells in these screens were simultaneously exposed to multiple perturbations, we reasoned that a given cell might concurrently respond to one or more ligands within the pool by expressing one or more GEPs. We discerned such GEPs by applying consensus non-negative matrix factorization (cNMF)25 (Fig. 3c). This approach revealed 13 GEPs with highly variable activity across cells that were consistent between the two runs (Extended Data Fig. 3c, Supplementary Table 3 and Methods). We contextualized these GEPs by correlating GEP expression scores with those for canonical gene signatures previously used to characterize PDAC cell states (Moffitt et al. classical and basal state signatures26), existing MsigDB and PROGENy genesets, signatures associated with our landmark (that is, single cytokine) perturbations (TNF, TGFβ and IFNγ), cell-cycle scores27 (G2M and S phase) and cell quality metrics (‘percentage ribosomal’, ‘percentage mitochondrial’ and ‘number of genes’). This allowed us to annotate 11 of the GEPs as nuclear factor (NF)-κβ activation, TGFβ response, type I IFN response, organoid classical, cell cycle S phase, cell cycle G2M phase, mitochondrial module, ribosomal module and three GEPs primarily expressed in low-complexity cells (Extended Data Fig. 3d,e and Methods)28. One of the two remaining GEPs displayed a negligible association (R < 0.03) with the Hallmark IFNγ response in MsigDB, yet clearly correlated with the module score for genes differentially expressed in the IFNγ-positive control wells (‘landmark perturbation effects’, Extended Data Fig. 3d). The final GEP did not clearly correlate (R > 0.25) with any signature in MsigDB or PROGENY (Extended Data Fig. 3d). We annotated it as the ‘PDAC interleukin (IL)-4 and IL-13’ signature on the basis of the ligands inferred to drive the observed response, supported by literature suggesting that some of its top ranked genes (CD55, PIGR and SERPINB3) can be induced by IL-4 and IL-13 in epithelial cells29,30,31 (Extended Data Fig. 3e) and the presence of detectable IL4I1, a downstream target gene directly induced by IL-4, in SPP1+ macrophages in the primary patient PDAC data from Raghavan et al.12 (Extended Data Fig. 3f). We note that related attempts to use gene-based deconvolution were highly overdetermined (given the ratio of ligands to genes), yielding a small, nondeterministic set of genes associated with each ligand that were difficult to interpret using standard enrichment analyses.Next, we applied our deconvolution framework to infer which ligands drove the activity of each cNMF GEP in the CS (Fig. 3d,e and Extended Data Fig. 3g,h). Encouragingly, IFNγ was associated with the IFNγ response GEP, IFNɑ2 was associated with the type I IFN signaling GEP, TGFβ and inhibin-βB (INHBB; a member of the TGFβ superfamily) were associated with the TGFβ response GEP, IL-1A, IL-1B and TNF (known activators of NF-κβ32) were associated with the NF-κβ response GEP and the mitogens Wnt7A and R-Spondin 3 (RSPO3) were associated with the ribosomal module GEP33,34. The low-complexity GEP 1, meanwhile, did not consistently associate with any ligand and was ignored as a quality artifact in downstream analyses. As stated above, we annotated the PDAC IL-4 and IL-13 response GEP on the basis of its association with IL-4 and IL-13, as well as with adiponectin (ADIPOQ). Interestingly, we would not have been able to infer which ligands were most active in the organoids by examining cognate receptor expression alone, as our active and inactive ligands did not have discernable patterns of cognate receptor gene expression in untreated organoid cells or in patient-matched primary tumor cells (Extended Data Fig. 3i and Supplementary Table 4). Together, our deconvolution framework paired the major axes of induced phenotypic variation in PDAC organoids with the TME ligands driving those transcriptional variations.To validate our CS results, we individually tested the 11 top-scoring ligands (IL-1A, IL-1B, TNF, IL-4, IL-13, ADIPOQ, TGFβ1, INHBB, IFNγ, Wnt7A and RSPO3) significantly associated with five of the aforementioned GEPs (NF-κβ response, PDAC IL-4 and IL-13 response, TGFβ response, IFNγ response and ribosomal module; Fig. 3f)23. We conducted two validation experiments on separate days (biological replicates) by screening six replicates of all ligands, including a set of negative control organoids in each run (Extended Data Fig. 4a). We then pooled these data with the negative controls and individual landmark ligand positive controls from the original CSs to form a final single-ligand perturbation dataset. To directly control for batch variation between runs (Extended Data Fig. 4a and Methods), we performed cNMF over the single-ligand dataset and then ran a linear model to predict the effect of each ligand on each single-ligand cNMF GEP while including experimental batch as a covariate. This identified single-ligand GEPs with similar enrichments to, and high correlations with, each of the GEPs from the CS dataset (Extended Data Fig. 4b–d, Supplementary Table 3 and Methods). For example, there was significant overlap between the CS and single-ligand PDAC IL-4 and IL-13 response GEPs (Fisher’s exact test P value = 1.62 × 10−26), and the overlapping genes ranked highly by gene loading in both modules (Extended Data Fig. 4e).More critically, these data revealed that the majority of the validation ligands significantly induced expression of the expected single-ligands GEPs (that is, those most similar to the CS GEPs discovered in the CS), with all five target GEPs induced by at least one validation ligand (Extended Data Fig. 4f). To cross-check these results for congruence, we also ran a linear fit for the validation data using the CS GEPs in place of the single-ligand ones and obtained similar results (Extended Data Fig. 4g). Lastly, we observed a strong correlation between ligand effects on CS cNMF GEPs in both the CS and the single-ligand validation experiments (Fig. 3f and Methods), demonstrating that compressed screening can robustly identify the transcriptional effects of TME ligands on patient-derived cancer models.In examining the GEPs in our CS and validation experiments and their associated ligands, we observed limited overlap with matched canonical ligand signatures, suggesting the possibility of disease- and context-specific transcriptional responses to TME signals (Extended Data Figs. 3d and 5a). For example, the PDAC IL-4 and IL-13 response GEP did not correlate well with other signatures within established geneset databases12,35 or those associated with PDAC outcomes26 (Extended Data Figs. 3d and 5a–e), likely explained by the limited overlap between the PDAC IL-4 and IL-13 response GEP and publicly available IL-4 and IL-13 gene signatures (Fig. 3g and Supplementary Table 5). Even for other response GEPs that had more obvious correlations with existing gene signatures (for example, those for TNF; Extended Data Figs. 3d and 4b), we observed only modest overlaps; for example, our PDAC NF-κB GEP shared only 20 genes with its matched PROGENY signature (Jaccard index = 0.058), our PDAC TGFβ GEP shared only 6 genes with the Hallmark TGFβ signature (Jaccard index = 0.047) and our PDAC IFNγ GEP shared only 12 genes with the Hallmark MSigDB signature (Jaccard index = 0.031; Extended Data Fig. 5f). Taken together, this reveals the importance of understanding and considering the contextual background against which perturbations are introduced as specific systems may respond differently to the same perturbation.To explore the potential clinical relevance of these disease-specific responses, we examined the prognostic value of our PDAC IL-4 and IL-13 response GEP and existing IL-4 and IL-13 response signatures in bulk RNA-seq samples from PDAC tumors in The Cancer Genome Atlas (TCGA)35. Comparing the top and bottom tertile for each IL-4 and IL-13 signature, we found that only our newly identified PDAC IL-4 and IL-13 GEP significantly stratified patients by survival, with higher expression associating with worse outcome (Fig. 3g). The same was true for our PDAC IFNγ and TGFβ GEPs, both of which had minimal overlap with their corresponding public signatures and showed better prognostic ability than their literature counterparts (Extended Data Fig. 5f,g). By further analyzing genes differentially expressed between IL-4- and DMSO-treated cells in our single-ligand perturbations (Extended Data Fig. 5h), we found that the top genes expressed in the IL-4 treatment group (MMP1, CD55 and DUOX2, also observed in our single-ligand PDAC IL-4 and IL-13 response GEP; Extended Data Fig. 4c) are known to be involved in, and enriched in a signature for, epithelial-to-mesenchymal transition (EMT), a process that promotes metastasis of cancer cells36,37,38, potentially explaining the lower overall survival we observed for patients highly expressing this GEP (Fig. 3g and Extended Data Fig. 5i,j). Collectively, these results suggest that the PDAC IL-4 and IL-13, IFNγ and TGFβ response GEPs discovered through our approach represent disease-specific responses with prognostic implications that would have been missed using existing databases. More broadly, our data highlight that disease- and context-specific transcriptional responses to TME signals may result in distinct functional consequences, and emphasize the importance of augmenting and carefully considering existing gene signature databases.CS dissects systems-level drug responses in primary samplesMany therapies act in a pleiotropic manner across the diverse cell types in a tissue, yet most of our drug screening is carried out in homogenous, immortalized cell lines. Accordingly, we often have a limited understanding of how perturbing a target molecule and its associated signaling network will impact complex cellular systems, ultimately registering therapeutic benefit or toxicity. As one foray into this problem, we leveraged a CS approach to scale small-molecule screening in fresh human PBMCs, a complex multicellular system, with a high-content scRNA-seq readout.PBMCs are composed of a heterogeneous mixture of T cells, natural killer (NK) cells, B cells and monocytes, among others. A key advantage of using freshly isolated PBMCs is their close representation of peripheral host physiology. Here, we simulated immune activation in this compartment with both LPS (a Toll-like receptor 4 agonist) and IFNβ (a key antiviral cytokine), canonical stimuli that can be sensed by multiple immune subsets through well-described pathways, and that have been used to explore immune responses to Gram-negative bacteria and viruses, respectively39,40. Critically, while immunomodulators are commonly used in the clinic41,42,43,44,45, it is challenging to screen the transcriptional impact of multiple compounds on fresh human PBMCs because of material, cost and labor constraints. Thus, we hypothesized that a CS could serve as a bridge in the ‘chain of translatability’, allowing a better understanding of how potent and specific molecules manifest desirable or deleterious systems-level effects.To examine the power of a CS in this setting, we tested the effects of 90 chemical compounds with a wide range of known MOAs curated by the Broad Drug Repurposing Hub (https://github.com/jump-cellpainting/JUMP-MOA) on immune responses among PBMCs (Fig. 4a). To execute these CSs, we generated identical pool sets (R = 3 replicates per drug and P = 6 drugs per well) in three 96-well plates with DMSO controls, added fresh PBMCs, incubated overnight and then stimulated with IFNβ or LPS for 4 h the following day while leaving one plate unstimulated (Fig. 4b). Following an evaluation of cell viability, each well was hashed using oligo-tagged antibodies and scRNA-seq was performed. After sample demultiplexing and quality control, we retained a total of 20,062 control cells (mean cell count per well ± s.d., 418 ± 351), 25,617 in the IFNβ plate (524 ± 353) and 65,868 in the LPS plate (1,372 ± 805). These yields highly correlated with our cell viability data, and we observed that both stimulations maintained cell counts relative to the unstimulated condition (Wilcoxon test one-sided P value < 0.01; Fig. 4c and Supplementary Fig. 1a,b). Running deconvolution on cell viability data revealed a handful of compounds with significant effects (Supplementary Fig. 1c). To understand how each compound impacted responses to each stimulus, we merged cell types from the control and stimulus conditions, and then ran cNMF on those cell types individually to identify GEPs (Fig. 4c, Supplementary Fig. 1d–f, Supplementary Table 6 and Methods). We avoided running cNMF on all three conditions at once to prevent effects from one stimulation being masked by the other. This analysis produced two main categories of GEPs: (1) ‘stimulant–response modules’, which changed significantly in response to stimulation, and (2) ‘cell-identity modules’, which did not.Fig. 4: CS of an MOA drug library identifies modulators of immune responses across human PBMCs.a, Overview of CS for examining the immunomodulatory effects of a MOA drug library on the responses of human PBMCs to stimulation with either LPS or IFNβ. b, Experimental overview. c, Left, overview of data processing and analysis workflow. Right, overview of the computational workflow used to analyze the CS data. d, Summary statistics of the number of affected stimulant–response GEPs and number of affected cell types for each molecule in each stimulation condition (LPS or IFNβ). Permutation test P-value cutoff = 0.01. e, Classification of the impact of a compound on cell-type-specific immune modulation. f, Example of a compound (δ-tocotrienol) with diverse effects within the same cell type and across different cell types. The usage scores for each GEP were normalized against the median of the DMSO + IFNβ condition. A Mann–Whitney U-test (two-sided, no correction) was performed to test score differences (Methods). The bottom, center and top lines of the box plots indicate the 25th percentile, median and 75th percentile, respectively. Whiskers are drawn to the farthest datapoints within 1.5× the interquartile range from the nearest hinge. Sample size by stimulation condition: DMSO, n = 396 (monocytes), n = 1,730 (CD4) and n = 404 (NK cells); IFNβ, n = 197 (monocytes), n = 1,276 (CD4) and n = 265 (NK cells); IFNβ + δ-tocotrienol, n = 41(monocytes), n = 631 (CD4) and n = 53 (NK cells). g, Ruxolitinib acts as an IFNβ interferer for CD4 T cell GEP3. A Mann–Whitney U-test (two-sided, no correction) was performed to test score differences (Methods). The bottom, center and top lines of the box plots indicate the 25th percentile, median and 75th percentile, respectively. Whiskers are drawn to the farthest datapoints within 1.5× the interquartile range from the nearest hinge. Sample sizes: DMSO, n = 1,730; IFNβ, n = 1,276; IFNβ + ruxolitinib, n = 1,356. h, Heat map showing the effects of all 90 compounds on PBMC responses to IFNβ (top) and LPS (bottom). Color bars represent the strength of the effect. Negative values denote a negative potentiator or interferer while positive values denote a positive potentiator or interferer. Permutation test P-value cutoff = 0.01. i, Low-dimensional embedding (UMAP) of single-compound validation experiment of ruxolitinib’s effect on LPS responses. j, Evaluation of CS LPS response module (CD8 T cell GEP3) in ruxolitinib-only perturbation. The bottom, center and top lines of the box plots indicate the 25th percentile, median and 75th percentile, respectively. Whiskers are drawn to the farthest datapoints within 1.5× the interquartile range from the nearest hinge. Sample sizes by condition: DMSO, n = 112 (B cells), n = 648 (CD4), n = 376 (CD8), n = 27 (dendritic cells (DCs)), n = 215 (monocytes), n = 188 (NK cells) and n = 27 (γδ T cells); LPS, n = 130 (B cells), n = 1,004 (CD4), n = 478 (CD8), n = 10 (DCs), n = 48 (monocytes), n = 264 (NK cells) and n = 41 (γδ T cells); LPS + ruxolitinib, n = 62 (B cells), n = 475 (CD4), n = 299 (CD8), n = 5 (DCs), n = 19 (monocytes), n = 152 (NK cells) and n = 28 (γδ T cells).To determine the impact of each stimulant and drug and their potential synergies, we modified our deconvolution framework to account for possible interactions between the immune stimulant and each pooled drug. Specifically, we included a Boolean interaction term in our design matrix to denote whether the drug and stimulant coexisted in a well (Xinteraction) and modified our coefficient matrix β to include fits for all three features (that is, βstimulants, βdrugs and βinteractions). Overall, we observed that several cell type stimulant–response modules were influenced by the compounds in our MOA library, with some significantly influencing multiple cell types and/or response GEPs (Fig. 4d). Our modified deconvolution method enabled us to compare the signs of βinteractions and βdrugs against those in βstimulants for specific combinations to understand the nature of stimulant–drug interactions in each cell type (Supplementary Figs. 2 and 3). Specifically, we divided our drugs into three categories on the basis of the significance of their impact on stimulant–responses in each cell type: (1) potentiators, which strengthen the effect of stimulant-mediated downstream signaling; (2) interferers, which oppose the effect of the response to a given stimulant; and (3) noninteractors, which do not have an effect on stimulant responses (Fig. 4e, Extended Data Fig. 6 and Methods). For potentiators and interferers, we further annotated them as ‘positive’ or ‘negative’ on the basis of whether the stimulant increases or decreases the GEP (Methods). As in the previous examples, one of the key parameters for this computational workflow is the P-value threshold we used in the deconvolution permutation test; as the P-value cutoff becomes more stringent, fewer drugs are nominated to have effects in each cell type (Extended Data Fig. 7a).Our IFNβ CS identified compounds with diverse effects on multiple GEP and cell type combinations. In the 90-molecule library, 87 molecules had an impact on IFNβ responses in at least one cell type, and 73 had an impact in multiple cell types (≥2). Notably, 20 molecules had statistically significant effects in four or five cell types, often with high pleiotropy. δ-Tocotrienol (vitamin E, an HMG-CoA reductase inhibitor), for example, has been described as a broadly anti-inflammatory agent; here, we observed variable effects on different immune cell subset GEPs, including opposing effects (potentiating and interfering) on different IFNβ response modules (GEP1 and GEP5) in monocytes, as well as interference with response GEPs in CD4 T cells (GEP3) and NK cells (GEP6) (Fig. 4f). Similarly, ponatinib, known to target multiple receptor tyrosine kinases46, elicited significant effects on IFNβ responses across multiple cell types, acting as a positive interferer of IFNβ GEP3 in B cells (Janus kinase (JAK)– signal transducer and activator of transcription (STAT) and TNF signaling), IFNβ GEP3 in CD4 T cells (JAK–STAT signaling) and IFNβ GEP1 in CD8 T cells (JAK–STAT and TNF signaling), but also as a negative interferer of IFNβ GEP8 in B cells (cell cycle and NF-κB) and IFNβ GEP3 in CD8 T cells (mitochondrial activity and adenosine triphosphate synthesis; Fig. 4f and Extended Data Fig. 7b–e). We note that, although three of the targeted IFNβ response GEPs were annotated as JAK–STAT signaling, ponatinib modulated both shared and unique genes across the cell types, once again highlighting context-specific signatures (Extended Data Fig. 7b, bottom).Next, we examined drugs with targets known to modulate immune responses. One of the major effects of type I IFN stimulation is the activation of JAK–STAT pathways41,42,43,44,45,47. Ruxolitinib, a known JAK–STAT inhibitor48, acted as a positive interferer of a CD4 T cell GEP (IFNβ GEP3), which positively correlated with JAK–STAT signature expression, giving us confidence that our CS is detecting real biological signals (Fig. 4g and Extended Data Fig. 7d). Relatedly, our screen also nominated rapamycin, a mammalian target of rapamycin (mTOR) inhibitor49, as a negative interferer of a GEP in B cells (IFNβ GEP8), which negatively correlated with JAK–STAT and TNF signaling; it was also a positive interferer of another GEP in B cells (IFNβ GEP3) that positively correlated with JAK–STAT and TNF signaling (Extended Data Fig. 7c and Extended Data Fig. 8a). These results are in line with the known cooperativity between mTOR and JAK–STAT pathways, and the previously reported role of rapamycin in destabilizing TNF expression50,51. Taken together, we found rich response patterns across most of the chemical compounds in the library (Fig. 4h, Extended Data Figs. 7 and 8 and Supplementary Figs. 2 and 3), with monocyte GEPs impacted by the greatest number of molecules (n = 64), followed by B cells (n = 15) and CD4 T cells (n = 14) (permutation test P value < 0.01).Relative to IFNβ, a similar analysis applied to LPS resulted in fewer stimulant–response modules at the same P-value threshold (Extended Data Fig. 9a); however, the impact of the JUMP MOA library on those modules was equally complex (Fig. 4h, Extended Data Fig. 6 and Supplementary Fig. 2). In total, 86 of 90 molecules had some effect on LPS responses in at least one cell type, 65 molecules had some effect in two or more cell types and 8 moelcules had some effect in four or five cell types. A representative compound with significant effects on LPS responses across multiple cell types is merimepodib, an inosine monophosphate dehydrogenase inhibitor that reduces intracellular guanine-based nucleotides, leading to the inhibition of DNA and RNA synthesis52. This broad-impacting MOA translated in our CS into pleiotropic effects across multiple cell types: Merimepodib acted as a positive interferer of B cell LPS GEP3 (TNF signaling and amine biosynthesis and metabolic processes) and CD8 T cell LPS GEP3 (JAK–STAT signaling), a positive potentiator of monocyte LPS GEP7 (translational and ribosomal activities) and a negative interferer of CD4 T cell LPS GEP1 (mitochondrial activities) (Extended Data Fig. 9b–f). BIX02188, a known mitogen-activated protein kinase (MAPK) kinase inhibitor, meanwhile, was found to be a positive interferer of LPS GEP4 in NK cells, a module highly correlated with TNF signaling and chemokine receptor (NK T cell pathways) expression (Extended Data Fig. 10a,b). Notably, it was reported that LPS-induced TNF production requires activation of the MAPK pathway53,54. In addition, PF 477736, a checkpoint kinase 1 inhibitor55, was one of the top positive potentiators of LPS GEP7 in monocytes, a module highly correlated with ribosomal events and translation elongation (Extended Data Figs. 9c and 10b). This is biologically consistent with the known role of PF 477736 in abrogating cell-cycle arrest, which in turn increases the rate of ribosomal and translational events. Interestingly and unexpectedly, we also found that AZD7545 and BX-912, two drugs with the same MOA annotations (pyruvate dehydrogenase kinase inhibitor), had divergent effects on the same LPS GEP7 in monocytes, a module highly correlated with RNA translation activity (Extended Data Fig. 9c and 10c). In the LPS stimulation screen, monocytes were again the cell type impacted by the largest number of molecules (n = 40), followed by CD8 T cells (n = 30) and NK cells (n = 11) (permutation test P value < 0.01).Across the entire library, 53 compounds had effects on both LPS and IFNβ responses in multiple cell types. For example, ML298 hydrochloride, a selective phospholipase 2 inhibitor56, was a positive interferer of NK cell LPS GEP7 (KRAS downregulated signal), a positive potentiator of monocyte LPS GEP7 (translational and ribosomal activities) and a negative interferer of CD4 T cell LPS GEP1 (mitochondrial activities). Meanwhile, it was a positive interferer of B cell IFNβ GEP3 (JAK–STAT and TNF signaling) and CD8 T cell IFNβ GEP1 (JAK–STAT and TNF signaling), a negative interferer of B cell IFNβ GEP8 (cell cycle and NF-κB) and a negative potentiator of monocyte IFNβ GEP5 (transmembrane transport and generation of precursor metabolite and energy) (Extended Data Figs. 7c,e, 8b, 9c,f and 10a,d). Furthermore, as with IFNβ GEP3 in CD4 T cells, we found a similar interfering effect by ruxolitinib on LPS GEP3 in CD8 T cells treated with LPS, a module also highly correlated with JAK–STAT signaling (Extended Data Fig. 9e and Extended Data Fig. 10e). Taken together, the consistency of our findings for ruxolitinib in both the LPS and the IFNβ screens and the identification of several new stimulant–drug interactions lend credence to the power of CS in identifying phenotypic and functional effects even in a complex multicellular system.To independently corroborate pleiotropy, we tested the effects of ruxolitinib on LPS stimulation conventionally. Briefly, cells were seeded in triplicate at the same density as in the CS, treated with ruxolitinib overnight and then acutely stimulated with LPS for 4 h, followed by scRNA-seq (Fig. 4i). These data replicated ruxolitinib’s ability to prevent LPS-induced activation of the JAK–STAT pathway in CD8 T cells (GEP3). We observed that ruxolitinib also inhibited the expression of genes associated with CD8 T cell GEP3 in CD4 T cells, γδ T cells, B cells and NK cells (Fig. 4j). In our LPS CS screen, ruxolitinib was also nominated as a positive interferer of LPS GEP3 in B cells, a module correlated with TNF signaling and amine biosynthesis and metabolic processes (Extended Data Fig. 9d). Our validation data confirmed this effect in B cells, and identified similarly significant interference effects in CD4 and CD8 T cells, NK cells and γδ T cells (Extended Data Fig. 10f). Intriguingly, ruxolitinib upregulated the expression of both these modules in monocytes (Fig. 4j and Extended Data Fig. 10f). We note that these opposite trends in monocytes were unexpected, were not technical in nature, and call for more detailed investigation.Overall, our results in PBMCs show how different compounds can either potentiate or suppress LPS and IFNβ GEPs in a cell-type-specific manner, even producing divergent responses in different cell types. Here, relative to the PDAC CS, we were able to achieve a second layer of compression by simultaneously profiling the impact of multiple drugs on multiple cell types. More generally, these results help establish the power of the CS framework for revealing systems-level information from phenotypic screens while reducing sample material, reagent and labor costs (here, by a factor of 5.5, roughly corresponding to the size of our pools P, relative to a conventional screen; Supplementary Table 7).

Hot Topics

Related Articles