Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O. & Walsh, A. Machine learning for molecular and materials science. Nature 559, 547–555 (2018).ArticleÂ
Google ScholarÂ
Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626, 177–185 (2023).Li, J. et al. AI applications through the whole life cycle of material discovery. Matter 3, 393–432 (2020).ArticleÂ
Google ScholarÂ
Deng, J. et al. A systematic study of key elements underlying molecular property prediction. Nat. Commun. 14, 6395 (2023).ArticleÂ
Google ScholarÂ
Stokes, J. M. et al. A deep learning approach to antibiotic discovery. Cell 180, 688–702 (2020).ArticleÂ
Google ScholarÂ
Dowden, H. & Munro, J. Trends in clinical success rates and therapeutic focus. Nat. Rev. Drug Discov. 18, 495–496 (2019).ArticleÂ
Google ScholarÂ
Galson, S. et al. The failure to fail smartly. Nat. Rev. Drug Discov. 20, 259–260 (2021).ArticleÂ
Google ScholarÂ
Pyzer-Knapp, E. O. et al. Accelerating materials discovery using artificial intelligence, high performance computing and robotics. npj Comput. Mater. 8, 84 (2022).ArticleÂ
Google ScholarÂ
Schneider, G. Automating drug discovery. Nat. Rev. Drug Discov. 17, 97–113 (2018).ArticleÂ
Google ScholarÂ
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning 1597–1607 (PMLR, 2020).He, K. et al. Masked autoencoders are scalable vision learners. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 16000–16009 (IEEE, 2022).Dai, A. M. & Le, Q. V. Semi-supervised sequence learning. Adv. Neural Inf. Process. Syst. 28, 3079–3087 (2015).
Google ScholarÂ
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) 4171–4186 (Association for Computational Linguistics, 2019).Wang, Y., Wang, J., Cao, Z. & Barati Farimani, A. Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 4, 279–287 (2022).ArticleÂ
Google ScholarÂ
Moon, K., Im, H.-J. & Kwon, S. 3D graph contrastive learning for molecular property prediction. Bioinformatics 39, 371 (2023).ArticleÂ
Google ScholarÂ
Fang, Y. et al. Knowledge graph-enhanced molecular contrastive learning with functional prompt. Nat. Mach. Intell. 5, 542–553 (2023).Stärk, H. et al. 3D Infomax improves GNNs for molecular property prediction. In International Conference on Machine Learning 20479–20502 (PMLR, 2022).Liu, S. et al. Pre-training molecular graph representation with 3D geometry. In International Conference on Learning Representations Workshop on Geometrical and Topological Representation Learning https://openreview.net/pdf?id=xQUe1pOKPam (ICLR, 2022).Li, S., Zhou, J., Xu, T., Dou, D. & Xiong, H. GeomGCL: geometric graph contrastive learning for molecular property prediction. In Proc. AAAI Conference on Artificial Intelligence 4541–4549 (PKP Publishing Services, 2022).Zeng, X. et al. Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework. Nat. Mach. Intell. 4, 1004–1016 (2022).ArticleÂ
Google ScholarÂ
Zhang, X.-C. et al. MG-BERT: leveraging unsupervised atomic representation learning for molecular property prediction. Brief. Bioinf. 22, bbab152 (2021).ArticleÂ
Google ScholarÂ
Ross, J. et al. Large-scale chemical language representations capture molecular structure and properties. Nat. Mach. Intell. 4, 1256–1264 (2022).ArticleÂ
Google ScholarÂ
Xia, J. et al. Mole-BERT: rethinking pre-training graph neural networks for molecules. In The Eleventh International Conference on Learning Representations https://openreview.net/pdf/21b1918178090348ffb159460ee696cfe8360dd2.pdf (ICLR, 2023).Rong, Y. et al. Self-supervised graph transformer on large-scale molecular data. Adv. Neural Inf. Process. Syst. 33, 12559–12571 (2020).
Google ScholarÂ
Fang, X. et al. Geometry-enhanced molecular representation learning for property prediction. Nat. Mach. Intell. 4, 127–134 (2022).ArticleÂ
Google ScholarÂ
Zhou, G. et al. Uni-Mol: a universal 3D molecular representation learning framework. In The Eleventh International Conference on Learning Representations https://openreview.net/pdf?id=IfFZr1gl0b (ICLR, 2023).Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).ArticleÂ
Google ScholarÂ
Zaidi, S. et al. Pre-training via denoising for molecular property prediction. In International Conference on Learning Representations https://openreview.net/pdf?id=tYIMtogyee (ICLR, 2023).Luo, S. et al. One transformer can understand both 2D & 3D molecular data. In The Eleventh International Conference on Learning Representations https://openreview.net/pdf?id=vZTp1oPV3PC (ICLR, 2023).Liu, S., Guo, H. & Tang, J. Molecular geometry pretraining with SE(3)-invariant denoising distance matching. In The Eleventh International Conference on Learning Representations https://openreview.net/pdf?id=CjTHVo1dvR (ICLR, 2023).Jiao, R., Han, J., Huang, W., Rong, Y. & Liu, Y. Energy-motivated equivariant pretraining for 3D molecular graphs. Proc. of the AAAI Conference on Artificial Intelligence 37, 8096–8104 (2023).ArticleÂ
Google ScholarÂ
Feng, R. et al. May the force be with you: unified force-centric pre-training for 3D molecular conformations. Adv. Neural Inf. Process. Syst. 36, 72750–72760 (2023).Thölke, P. & Fabritiis, G.D. Equivariant transformers for neural network based molecular potentials. In International Conference on Learning Representations https://openreview.net/pdf?id=zNHzqZ9wrRB (ICLR, 2022).Boltzmann, L. Studien uber das gleichgewicht der lebenden kraft. Wissen. Abh. 1, 49–96 (1868).
Google ScholarÂ
Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3, 1603015 (2017).ArticleÂ
Google ScholarÂ
Schütt, K. et al. SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. Adv. Neural Inf. Process. Syst. 30, 992–1002 (Curran Associates, 2017).Chmiela, S. et al. Accurate global machine learning force fields for molecules with hundreds of atoms. Sci. Adv. 9, 0873 (2023).ArticleÂ
Google ScholarÂ
Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 3887 (2018).ArticleÂ
Google ScholarÂ
Wang, Y., Xu, C., Li, Z. & Barati Farimani, A. Denoise pretraining on nonequilibrium molecules for accurate and transferable neural potentials. J. Chem. Theory Comput. 19, 5077–5087 (2023).ArticleÂ
Google ScholarÂ
Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).ArticleÂ
Google ScholarÂ
Smith, J. S. et al. The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules. Sci. Data 7, 134 (2020).ArticleÂ
Google ScholarÂ
Ruddigkeit, L., Van Deursen, R., Blum, L. C. & Reymond, J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52, 2864–2875 (2012).ArticleÂ
Google ScholarÂ
Ramakrishnan, R., Dral, P. O., Rupp, M. & Von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1, 140022 (2014).ArticleÂ
Google ScholarÂ
Townshend, R. et al. ATOM3D: tasks on molecules in three dimensions. In Proc. Neural Information Processing Systems Track on Datasets and Benchmarks (2021).Landrum, G. et al. RDKit: a software suite for cheminformatics, computational chemistry, and predictive modeling. Greg Landrum (2013).Chmiela, S., Sauceda, H. E., Poltavsky, I., Müller, K.-R. & Tkatchenko, A. sGDML: constructing accurate and data efficient molecular force fields using machine learning. Comput. Phys. Commun. 240, 38–45 (2019).ArticleÂ
Google ScholarÂ
Nakata, M. & Shimazaki, T. PubChemQC project: a large-scale first-principles electronic structure database for data-driven chemistry. J. Chem. Inf. Model. 57, 1300–1308 (2017).ArticleÂ
Google ScholarÂ
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet—a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).ArticleÂ
Google ScholarÂ
Satorras, V. G., Hoogeboom, E. & Welling, M. E(n) equivariant graph neural networks. In International Conference on Machine Learning 9323–9332 (PMLR, 2021).Gasteiger, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In International Conference on Learning Representations https://openreview.net/pdf?id=B1eWbxStPH (ICLR, 2020).Gasteiger, J., Giri, S., Margraf, J.T. & Günnemann, S. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. In Machine Learning for Molecules Workshop, NeurIPS (2020).Liu, Y. et al. Spherical message passing for 3D molecular graphs. In International Conference on Learning Representations https://openreview.net/pdf?id=givsRXsOt9r (ICLR, 2022).Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In International Conference on Machine Learning 9377–9388 (PMLR, 2021).Öztürk, H., Özgür, A. & Ozkirimli, E. DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 34, 821–829 (2018).ArticleÂ
Google ScholarÂ
Rao, R. et al. Evaluating protein transfer learning with tape. Adv. Neural Inf. Process. Syst. 32, 9689–9701 (2019).
Google ScholarÂ
Elnaggar, A. et al. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2021).ArticleÂ
Google ScholarÂ
Somnath, V. R., Bunne, C. & Krause, A. Multi-scale representation learning on proteins. Adv. Neural Inf. Process. Syst. 34, 25244–25255 (2021).
Google ScholarÂ
Wang, L., Liu, H., Liu, Y., Kurtin, J. & Ji, S. Learning hierarchical protein representations via complete 3D graph networks. In The Eleventh International Conference on Learning Representations https://openreview.net/forum?id=9X-hgLDLYkQ (ICLR, 2022).Feng, S. MOL_LMDB. figshare https://doi.org/10.6084/m9.figshare.24961485.v1 (2024).Ramakrishnan, R., Dral, P., Rupp, M. & Anatole von Lilienfeld, O. Quantum chemistry structures and properties of 134 kilo molecules. figshare https://doi.org/10.6084/m9.figshare.c.978904.v5 (2014).Townshend, R. J. L. ATOM3D: ligand binding affinity (LBA) dataset. Zenodo https://doi.org/10.5281/zenodo.4914718 (2021).Ni, Y. Source data for figures in ‘Pre-training with fractional denoising to enhance molecular property prediction’. figshare https://doi.org/10.6084/m9.figshare.25902679.v1 (2024).Feng, S. Pre-training with fractional denoising to enhance molecular property prediction. Zenodo https://doi.org/10.5281/zenodo.12697467 (2024).