A bioactivity foundation model using pairwise meta-learning

