Dataset for quantum-mechanical exploration of conformers and solvent effects in large drug-like molecules

