Pre-training with fractional denoising to enhance molecular property prediction

Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O. & Walsh, A. Machine learning for molecular and materials science. Nature 559, 547–555 (2018).Article

Google Scholar
Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626, 177–185 (2023).Li, J. et al. AI applications through the whole life cycle of material discovery. Matter 3, 393–432 (2020).Article

Google Scholar
Deng, J. et al. A systematic study of key elements underlying molecular property prediction. Nat. Commun. 14, 6395 (2023).Article

Google Scholar
Stokes, J. M. et al. A deep learning approach to antibiotic discovery. Cell 180, 688–702 (2020).Article

Google Scholar
Dowden, H. & Munro, J. Trends in clinical success rates and therapeutic focus. Nat. Rev. Drug Discov. 18, 495–496 (2019).Article

Google Scholar
Galson, S. et al. The failure to fail smartly. Nat. Rev. Drug Discov. 20, 259–260 (2021).Article

Google Scholar
Pyzer-Knapp, E. O. et al. Accelerating materials discovery using artificial intelligence, high performance computing and robotics. npj Comput. Mater. 8, 84 (2022).Article

Google Scholar
Schneider, G. Automating drug discovery. Nat. Rev. Drug Discov. 17, 97–113 (2018).Article

Google Scholar
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning 1597–1607 (PMLR, 2020).He, K. et al. Masked autoencoders are scalable vision learners. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 16000–16009 (IEEE, 2022).Dai, A. M. & Le, Q. V. Semi-supervised sequence learning. Adv. Neural Inf. Process. Syst. 28, 3079–3087 (2015).
Google Scholar
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) 4171–4186 (Association for Computational Linguistics, 2019).Wang, Y., Wang, J., Cao, Z. & Barati Farimani, A. Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 4, 279–287 (2022).Article

Google Scholar
Moon, K., Im, H.-J. & Kwon, S. 3D graph contrastive learning for molecular property prediction. Bioinformatics 39, 371 (2023).Article

Google Scholar
Fang, Y. et al. Knowledge graph-enhanced molecular contrastive learning with functional prompt. Nat. Mach. Intell. 5, 542–553 (2023).Stärk, H. et al. 3D Infomax improves GNNs for molecular property prediction. In International Conference on Machine Learning 20479–20502 (PMLR, 2022).Liu, S. et al. Pre-training molecular graph representation with 3D geometry. In International Conference on Learning Representations Workshop on Geometrical and Topological Representation Learning https://openreview.net/pdf?id=xQUe1pOKPam (ICLR, 2022).Li, S., Zhou, J., Xu, T., Dou, D. & Xiong, H. GeomGCL: geometric graph contrastive learning for molecular property prediction. In Proc. AAAI Conference on Artificial Intelligence 4541–4549 (PKP Publishing Services, 2022).Zeng, X. et al. Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework. Nat. Mach. Intell. 4, 1004–1016 (2022).Article

Google Scholar
Zhang, X.-C. et al. MG-BERT: leveraging unsupervised atomic representation learning for molecular property prediction. Brief. Bioinf. 22, bbab152 (2021).Article

Google Scholar
Ross, J. et al. Large-scale chemical language representations capture molecular structure and properties. Nat. Mach. Intell. 4, 1256–1264 (2022).Article

Google Scholar
Xia, J. et al. Mole-BERT: rethinking pre-training graph neural networks for molecules. In The Eleventh International Conference on Learning Representations https://openreview.net/pdf/21b1918178090348ffb159460ee696cfe8360dd2.pdf (ICLR, 2023).Rong, Y. et al. Self-supervised graph transformer on large-scale molecular data. Adv. Neural Inf. Process. Syst. 33, 12559–12571 (2020).
Google Scholar
Fang, X. et al. Geometry-enhanced molecular representation learning for property prediction. Nat. Mach. Intell. 4, 127–134 (2022).Article

Google Scholar
Zhou, G. et al. Uni-Mol: a universal 3D molecular representation learning framework. In The Eleventh International Conference on Learning Representations https://openreview.net/pdf?id=IfFZr1gl0b (ICLR, 2023).Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).Article

Google Scholar
Zaidi, S. et al. Pre-training via denoising for molecular property prediction. In International Conference on Learning Representations https://openreview.net/pdf?id=tYIMtogyee (ICLR, 2023).Luo, S. et al. One transformer can understand both 2D & 3D molecular data. In The Eleventh International Conference on Learning Representations https://openreview.net/pdf?id=vZTp1oPV3PC (ICLR, 2023).Liu, S., Guo, H. & Tang, J. Molecular geometry pretraining with SE(3)-invariant denoising distance matching. In The Eleventh International Conference on Learning Representations https://openreview.net/pdf?id=CjTHVo1dvR (ICLR, 2023).Jiao, R., Han, J., Huang, W., Rong, Y. & Liu, Y. Energy-motivated equivariant pretraining for 3D molecular graphs. Proc. of the AAAI Conference on Artificial Intelligence 37, 8096–8104 (2023).Article

Google Scholar
Feng, R. et al. May the force be with you: unified force-centric pre-training for 3D molecular conformations. Adv. Neural Inf. Process. Syst. 36, 72750–72760 (2023).Thölke, P. & Fabritiis, G.D. Equivariant transformers for neural network based molecular potentials. In International Conference on Learning Representations https://openreview.net/pdf?id=zNHzqZ9wrRB (ICLR, 2022).Boltzmann, L. Studien uber das gleichgewicht der lebenden kraft. Wissen. Abh. 1, 49–96 (1868).
Google Scholar
Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3, 1603015 (2017).Article

Google Scholar
Schütt, K. et al. SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. Adv. Neural Inf. Process. Syst. 30, 992–1002 (Curran Associates, 2017).Chmiela, S. et al. Accurate global machine learning force fields for molecules with hundreds of atoms. Sci. Adv. 9, 0873 (2023).Article

Google Scholar
Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 3887 (2018).Article

Google Scholar
Wang, Y., Xu, C., Li, Z. & Barati Farimani, A. Denoise pretraining on nonequilibrium molecules for accurate and transferable neural potentials. J. Chem. Theory Comput. 19, 5077–5087 (2023).Article

Google Scholar
Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).Article

Google Scholar
Smith, J. S. et al. The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules. Sci. Data 7, 134 (2020).Article

Google Scholar
Ruddigkeit, L., Van Deursen, R., Blum, L. C. & Reymond, J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52, 2864–2875 (2012).Article

Google Scholar
Ramakrishnan, R., Dral, P. O., Rupp, M. & Von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1, 140022 (2014).Article

Google Scholar
Townshend, R. et al. ATOM3D: tasks on molecules in three dimensions. In Proc. Neural Information Processing Systems Track on Datasets and Benchmarks (2021).Landrum, G. et al. RDKit: a software suite for cheminformatics, computational chemistry, and predictive modeling. Greg Landrum (2013).Chmiela, S., Sauceda, H. E., Poltavsky, I., Müller, K.-R. & Tkatchenko, A. sGDML: constructing accurate and data efficient molecular force fields using machine learning. Comput. Phys. Commun. 240, 38–45 (2019).Article

Google Scholar
Nakata, M. & Shimazaki, T. PubChemQC project: a large-scale first-principles electronic structure database for data-driven chemistry. J. Chem. Inf. Model. 57, 1300–1308 (2017).Article

Google Scholar
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet—a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).Article

Google Scholar
Satorras, V. G., Hoogeboom, E. & Welling, M. E(n) equivariant graph neural networks. In International Conference on Machine Learning 9323–9332 (PMLR, 2021).Gasteiger, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In International Conference on Learning Representations https://openreview.net/pdf?id=B1eWbxStPH (ICLR, 2020).Gasteiger, J., Giri, S., Margraf, J.T. & Günnemann, S. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. In Machine Learning for Molecules Workshop, NeurIPS (2020).Liu, Y. et al. Spherical message passing for 3D molecular graphs. In International Conference on Learning Representations https://openreview.net/pdf?id=givsRXsOt9r (ICLR, 2022).Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In International Conference on Machine Learning 9377–9388 (PMLR, 2021).Öztürk, H., Özgür, A. & Ozkirimli, E. DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 34, 821–829 (2018).Article

Google Scholar
Rao, R. et al. Evaluating protein transfer learning with tape. Adv. Neural Inf. Process. Syst. 32, 9689–9701 (2019).
Google Scholar
Elnaggar, A. et al. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2021).Article

Google Scholar
Somnath, V. R., Bunne, C. & Krause, A. Multi-scale representation learning on proteins. Adv. Neural Inf. Process. Syst. 34, 25244–25255 (2021).
Google Scholar
Wang, L., Liu, H., Liu, Y., Kurtin, J. & Ji, S. Learning hierarchical protein representations via complete 3D graph networks. In The Eleventh International Conference on Learning Representations https://openreview.net/forum?id=9X-hgLDLYkQ (ICLR, 2022).Feng, S. MOL_LMDB. figshare https://doi.org/10.6084/m9.figshare.24961485.v1 (2024).Ramakrishnan, R., Dral, P., Rupp, M. & Anatole von Lilienfeld, O. Quantum chemistry structures and properties of 134 kilo molecules. figshare https://doi.org/10.6084/m9.figshare.c.978904.v5 (2014).Townshend, R. J. L. ATOM3D: ligand binding affinity (LBA) dataset. Zenodo https://doi.org/10.5281/zenodo.4914718 (2021).Ni, Y. Source data for figures in ‘Pre-training with fractional denoising to enhance molecular property prediction’. figshare https://doi.org/10.6084/m9.figshare.25902679.v1 (2024).Feng, S. Pre-training with fractional denoising to enhance molecular property prediction. Zenodo https://doi.org/10.5281/zenodo.12697467 (2024).

Pre-training with fractional denoising to enhance molecular property prediction

Genome-scale models in human metabologenomics

With departments and courses facing closures UK chemistry needs a new hero | News

What makes songbirds different in their breeding cycles? – Functional Ecologists

A week of Bulk RNA-Seq at the University of Minnesota

University of Nebraska researchers using RNA sequencing for rapid hybrid development

Hot Topics

Genome-scale models in human metabologenomics

With departments and courses facing closures UK chemistry needs a new hero | News

What makes songbirds different in their breeding cycles? – Functional Ecologists

Related Articles

Balancing Act: Pregnancy and Bipolar Disorder

Cohesion at the cellular level: flexible yet stable

Gut bacteria influence responses to immunotherapy in patients with asbestos related cancer

Quick Links

Must Read

Genome-scale models in human metabologenomics

With departments and courses facing closures UK chemistry needs a new hero | News

What makes songbirds different in their breeding cycles? – Functional Ecologists

A week of Bulk RNA-Seq at the University of Minnesota

Popular Articles

Genome-scale models in human metabologenomics

With departments and courses facing closures UK chemistry needs a new hero | News

What makes songbirds different in their breeding cycles? – Functional Ecologists