Inverse mapping of quantum properties to structures for chemical space of small organic molecules

