Low concentration cell painting images enable the identification of highly potent compounds

In the previous section, we reported results for 57 assays chosen using the criteria listed at the beginning of the Results section. Because we need a systematic way to evaluate performance, the assay criteria are fairly rigid, and assay diversity is therefore low: 43 of the 57 assays are cell proliferation assays, 11 are kinase assays, and 3 are assays on other protein targets.

Therefore, in this section, to increase assay diversity, we manually pick a number of off- and on-target assays on which to test our low concentration image method. These assays differ from the previous 57 assays in several ways: the pIC50 threshold at which we consider a compound moderately or highly potent can be different, there can be plenty of positive samples, or the assay can even measure something other than pIC50. Our aim is to demonstrate how we apply the method in these different settings, and in which settings the method works well.

Phospholipidosis assay

Table 2. PLD metrics table. High potency precision increases as the inference image concentration decreases, indicating that the pIC50\(\ge\)4.6 model has been repurposed for retrieving highly potent compounds with pIC50\(\ge\)6. The best model in terms of high potency AUC-ROC and AUC-PR is \([20\; \upmu {\text{M}} / 4\; \upmu {\text{M}} / 4.6]\), outperforming the conventional method \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 6]\).

Drug-induced phospholipidosis (PLD) is characterized by the excess accumulation of phospholipids in tissues. Organs affected by phospholipidosis exhibit inflammatory reactions and histopathological changes [14]. Hence, PLD is considered an adverse effect, and the PLD assay is an essential liability assay to screen in early drug development.

For the PLD assay, the model with the best AUC-ROC is the pIC50\(\ge\)4.6 model, which uses the lowest potency threshold considered for this assay.
The model with the highest potency threshold is pIC50\(\ge\)6, which has the worst AUC-ROC of all our PLD models. The active ratio in the training set is 0.776 at a potency threshold of 4.6 and 0.109 at a potency threshold of 6. Our aim is to investigate whether we can repurpose a pIC50\(\ge\)4.6 model into a model that specifically retrieves highly potent compounds with pIC50\(\ge\)6 (Hypothesis 1), and whether this method can outperform the pIC50\(\ge\)6 model (Hypothesis 2).

The increase in high potency precision from 0.209 to 0.429 in Table 2 shows that using low concentration images for inference can help specifically retrieve highly potent compounds, indicating that Hypothesis 1 holds. It is interesting to note, however, that in this case the model with the highest high potency precision is \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 6]\) at 0.462. The low concentration model \([20\; \upmu {\text{M}} / 8\; \upmu {\text{M}} / 4.6]\) comes close at 0.429.

Hypothesis 2 holds, as the best model in terms of high potency AUC-PR and AUC-ROC is \([20\; \upmu {\text{M}} / 4\; \upmu {\text{M}} / 4.6]\), achieving 0.0970 and 0.731, respectively. In contrast, the AUC-ROC of model \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 6]\) is close to the random baseline of 0.5, likely because there are few positives with which to train the model at a pIC50 threshold of 6, where the active ratio is only 0.109. Interestingly, model \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 6]\) achieves the highest high potency precision but relatively low high potency AUC-PR and AUC-ROC at the same time. This is because this model misclassifies many more positives than the other models.

BSEP assay

Table 3. BSEP metrics table. High potency precision increases as the inference image concentration decreases, indicating that the pIC50\(\ge\)4.5 model has been repurposed for retrieving highly potent compounds with pIC50\(\ge\)5.5.
The best model in terms of high potency AUC-ROC and AUC-PR is \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 5.5]\). In this case, our method does not outperform the conventional method.

Bile salt export pump (BSEP) is the major transporter for the secretion of bile acids from hepatocytes into bile in humans. BSEP inhibition may contribute to the initiation of human drug-induced liver injury (DILI) [15]. Since DILI is a frequent cause of drug failure in development, screening for BSEP inhibition is also vital in early drug discovery.

For the BSEP assay, the pIC50\(\ge\)4.5 and pIC50\(\ge\)5.5 classifiers are the lowest and highest potency BSEP models that we build. The active ratio in the training set is 0.825 at a potency threshold of 4.5 and 0.400 at a potency threshold of 5.5. We investigate whether we can repurpose the pIC50\(\ge\)4.5 model to retrieve highly potent compounds with pIC50\(\ge\)5.5 (Hypothesis 1), and whether our method outperforms the high potency pIC50\(\ge\)5.5 model (Hypothesis 2).

Hypothesis 1 holds, as shown in Table 3. Models \([20\; \upmu {\text{M}} / 0.16\; \upmu {\text{M}} / 4.5]\), \([20\; \upmu {\text{M}} / 0.8\; \upmu {\text{M}} / 4.5]\) and \([20\; \upmu {\text{M}} / 4\; \upmu {\text{M}} / 4.5]\) specifically retrieve compounds with pIC50\(\ge\)5.5, with significantly fewer false positives than the \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 4.5]\) model.

In this case, the high potency pIC50\(\ge\)5.5 model is the best model in terms of high potency AUC-PR and AUC-ROC (Table 3), indicating that there are still enough positive samples to train the high potency model. Our method does not improve high potency AUC-PR or AUC-ROC, hence Hypothesis 2 does not hold.

Immunology on-target assay

This is an assay for an immunology protein target. The pIC50\(\ge\)5.3 and pIC50\(\ge\)6 classifiers are the lowest and highest potency models that we build for this assay.
Hence, for this case we investigate whether we can repurpose the pIC50\(\ge\)5.3 model to retrieve highly potent compounds with pIC50\(\ge\)6 (Hypothesis 1), and whether our method outperforms the high potency pIC50\(\ge\)6 model (Hypothesis 2). The active ratio in the training set is 0.437 at a potency threshold of 5.3 and 0.0765 at a potency threshold of 6.

In terms of Hypothesis 1, we observe a trend of increasing high potency precision as the image concentration for inference decreases (Table 4), indicating that the model can be repurposed for highly potent compound retrieval. However, the high potency precision increase for this assay is smaller than in the other cases, from 0.739 at \(20\; \upmu {\text{M}}\) to 0.757 at \(0.8\; \upmu {\text{M}}\), and the increase is not as monotonic.

Despite the low active ratio of 0.0765, the high potency model \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 6]\) performs well, with an AUC-ROC of 0.718 and an AUC-PR of 0.258. However, our method, specifically models \([20\; \upmu {\text{M}} / 0.8\; \upmu {\text{M}} / 5.3]\) and \([20\; \upmu {\text{M}} / 4\; \upmu {\text{M}} / 5.3]\), improves on these scores, achieving 0.347 AUC-PR and 0.821 AUC-ROC, and 0.328 AUC-PR and 0.861 AUC-ROC, respectively. This indicates that Hypothesis 2 holds for this assay.

Table 4. Immunology target metrics table. High potency precision increases as the inference image concentration decreases, indicating that the pIC50\(\ge\)5.3 model has been repurposed for retrieving highly potent compounds with pIC50\(\ge\)6. However, the increase is smaller and not as monotonic as in the previous cases.
The best model in terms of high potency AUC-ROC is \([20\; \upmu {\text{M}} / 4\; \upmu {\text{M}} / 5.3]\), and in terms of high potency AUC-PR is \([20\; \upmu {\text{M}} / 0.8\; \upmu {\text{M}} / 5.3]\), both outperforming the conventional method \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 6]\).

Glu/Gal assay

Another important off-target assay in early drug development is Glu/Gal. Glu/Gal is the primary assay of choice for drug-induced mitochondrial toxicity [16]. Briefly, mitochondrial dysfunction is determined from the ratio of the test substance's cytotoxicity under glucose and galactose culture conditions [1], hence the Glu/Gal nomenclature. The measure of mitochondrial toxicity is a ratio of two IC50 values, not a single pIC50 value as in the previous cases. The higher the Glu/Gal ratio, the stronger the indication that the compound induces mitochondrial toxicity.

Instead of retrieving highly potent compounds, in this case we test whether our method can specifically retrieve highly toxic compounds from this assay. We consider compounds with a Glu/Gal ratio \(\ge\)5 to be highly toxic, and those with a ratio \(\ge\)2 to be moderately toxic. The active ratio in the training set is 0.595 at a toxicity threshold of 2 and 0.275 at a toxicity threshold of 5. It is also worth noting that the Glu/Gal ratios are distributed over a wide range (up to 500), but we are only interested in thresholds in the narrow range of 2 to 5. This is because we consider every compound with a ratio \(\ge\)5 to be equally toxic. As a result, compounds in the test set only have Glu/Gal ratios between 0 and 20.

Table 5 shows that high toxicity precision increases from 0.483 for model \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 2]\) to 1 for model \([20\; \upmu {\text{M}} / 0.8\; \upmu {\text{M}} / 2]\). Using low concentration images for inference can therefore specifically retrieve highly toxic compounds, which shows that Hypothesis 1 holds for this assay.
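To make the precision figures above concrete, the following is a minimal sketch of how a high toxicity precision could be computed. It assumes that the metric is the fraction of compounds predicted active by the ratio \(\ge\)2 model that actually have a Glu/Gal ratio \(\ge\)5; the function name, the toy ratios and the toy predictions are hypothetical and are not taken from the study.

```python
def high_toxicity_precision(predicted_active, glu_gal_ratios, high_threshold=5.0):
    """Fraction of predicted-active compounds whose Glu/Gal ratio is >= high_threshold.

    Assumed definition, for illustration only. Returns None when no compound
    is predicted active, because precision is undefined in that case.
    """
    retrieved = [r for p, r in zip(predicted_active, glu_gal_ratios) if p]
    if not retrieved:
        return None
    highly_toxic = sum(1 for r in retrieved if r >= high_threshold)
    return highly_toxic / len(retrieved)


# Hypothetical measured Glu/Gal ratios for six test compounds.
ratios = [1.2, 2.5, 3.0, 6.0, 12.0, 0.8]

# Hypothetical predictions of the same moderate-toxicity model when it is
# given features from images at three different inference concentrations.
pred_high_conc = [False, True, True, True, True, False]
pred_low_conc = [False, False, False, True, True, False]
pred_lowest_conc = [False] * 6

print(high_toxicity_precision(pred_high_conc, ratios))    # 0.5
print(high_toxicity_precision(pred_low_conc, ratios))     # 1.0
print(high_toxicity_precision(pred_lowest_conc, ratios))  # None
```

In this toy example the lower concentration predictions retrieve fewer compounds, but all of them exceed the high toxicity threshold, which is the pattern that Hypothesis 1 describes.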
We note that, similar to PLD, inference using \(0.16\; \upmu {\text{M}}\) images returns no active compounds in this assay. This is because the compound concentration is so low that no signal related to these activities is induced in the cell.

Regarding Hypothesis 2, the high toxicity model \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 5]\) remains the best performing model (Table 5), at 0.226 high toxicity AUC-PR and 0.713 high toxicity AUC-ROC. In this case, 27\(\%\) of the labels are positive for the classification of highly toxic compounds with Glu/Gal ratio \(\ge\)5, which provides plenty of positive examples for training a high toxicity model. Our method does not outperform the conventional method for this assay.

Table 5. Glu/Gal metrics table. High toxicity precision tends to increase as the inference image concentration decreases. This indicates that the Toxicity\(\ge\)2 model has been repurposed for retrieving highly toxic compounds with Toxicity\(\ge\)5. The best model in terms of high toxicity AUC-ROC and AUC-PR is \([20\; \upmu {\text{M}} / 20\; \upmu {\text{M}} / 5]\). In this case, our method does not outperform the conventional method.

Discussion

We have presented an approach, improving on an existing image-based small molecule activity optimization pipeline, to specifically retrieve highly potent compounds in a biological assay. We start by training a moderate-potency model on \(20\; \upmu {\text{M}}\) cell painting images to classify compounds at a pIC50 potency threshold low enough that there are still plenty of positive examples to train effectively. Then, we repurpose that well-performing model for higher potency classification by performing inference using lower concentration images as input. In terms of application in the drug discovery pipeline, being able to classify highly potent compounds accurately can help prioritize hits from screening for experimental follow-up based on potency.
It can also help deprioritize compounds with potent off-target activities in the hit-triaging phase. However, the improvement in retrieval of highly potent compounds comes at a cost in data generation, since benefiting from our approach requires additional cell painting images at different concentrations.

We highlighted two points that our approach can achieve. Hypothesis 1 is that a good moderate-potency model can be repurposed to specifically retrieve compounds with higher potency, by using features from low concentration images for inference without retraining the model. We assessed this point using the precision of classifying a highly potent compound, on 57 assays and 4 additional assays in the Case Studies section. We found that this behavior can be observed in almost all assays we tested, the majority of which are cell proliferation assays. Although using low concentration images for inference retrieves fewer active compounds, these compounds tend to be highly potent.

Hypothesis 2 is that if data imbalance adversely affects the training of a high potency model, our approach can outperform the high potency model (the conventional method) in classifying highly potent compounds. We assessed this point on the same selection of assays as above, using AUC-ROC and corrected AUC-PR as metrics. We found that this point holds for around 65\(\%\) to 75\(\%\) of those assays. Overall, AUC-ROC scores increase by around 0.1 to 0.2, and AUC-PR scores increase by around 0.2 to 0.5, indicating an improvement over the conventional method in the majority of assays. Our approach can serve as a replacement for a conventional high potency model when activity labels for training are scarce.
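The comparison behind Hypothesis 2 can be illustrated with a short sketch. It assumes only what the text states: each model is scored on how well it ranks compounds whose pIC50 is at or above the high potency threshold. The function names, pIC50 values and model scores below are hypothetical. AUC-ROC is computed here with the pairwise ranking formula rather than with any particular library, and corrected AUC-PR is left out because its definition is not given in this section.

```python
def high_potency_labels(pic50_values, threshold):
    """Binary labels: 1 if pIC50 >= threshold, otherwise 0."""
    return [1 if v >= threshold else 0 for v in pic50_values]


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive is scored
    above a randomly chosen negative (ties count as 0.5).

    Returns None when either class is absent, since AUC-ROC is undefined then.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        return None
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical test compounds, labelled at a high potency threshold of 6.
pic50 = [4.2, 4.8, 5.1, 6.3, 6.8, 5.5]
labels = high_potency_labels(pic50, 6.0)

# Hypothetical scores: a moderate-potency model applied to low concentration
# image features, and a conventional high potency model.
scores_repurposed = [0.05, 0.10, 0.20, 0.80, 0.90, 0.30]
scores_conventional = [0.40, 0.10, 0.60, 0.50, 0.70, 0.30]

print(auc_roc(labels, scores_repurposed))    # 1.0
print(auc_roc(labels, scores_conventional))  # 0.875
```

Both models are evaluated against the same high potency labels, so the comparison reflects only how each model ranks the highly potent compounds, not the threshold it was trained with.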
