Assessing the accuracy and efficacy of multiscale computational methods in predicting reaction mechanisms and kinetics of SN2 reactions and Claisen rearrangement

In the following, we embark on an exploration of the results obtained from our computational investigations. We begin by unraveling the intricate details of the mechanism underlying the SN2 reactions, shedding light on the activation energies extracted through the multiscale calculations. Next, we dissect the mechanism governing the Claisen rearrangement, analyzing the activation-free energy associated with this pivotal rearrangement process as unveiled by our multiscale computational calculations.Reaction mechanism of the studied SN2 reactionsIn the generally accepted mechanism of SN2 reactions depicted in Fig. 2, the hallmark feature is the five-coordinated carbon atom with an approximately 180° angle between the attacking nucleophile, the substrate’s carbon atom, and the leaving group in the TS structure. However, our results revealed that scanning the reaction path in vacuum did not reproduce this expected TS structure. Instead, it resulted in atypical angles and an incongruous structure. Notably, including the solvent implicitly using the CPCM model during both optimization and scan calculations did not significantly alter these observations. To successfully obtain the TS structure in vacuum, we had to constrain the NCI angle to ~ 180° during the scan process.Given that vacuum calculations (QM-only calculations) do not faithfully reproduce the SN2 reaction mechanism, we conducted these reactions explicitly accounting for the solvent environment using the multiscale methods. We employed variants of the QM/MM method, i.e. using different thicknesses of the MM-active region around the QM region and MM-fixed region around the MM-active region (i.e., 10.0/0.0, 7.5/2.5, 5.0/5.0, 2.5/7.5 Å; the value preceding the slash indicates the thickness of the MM-active region, while the value following the slash denotes the thickness of the MM-fixed region), QM1/QM2 calculations with QM2 region thicknesses of 4.5 and 7.5 Å around the QM1 region, and QM1/QM2/MM calculations with QM2, MM-active, and MM-fixed regions sized at 4.5, 3.0, and 2.5 Å, respectively (cf. Fig. 1).Following the multiscale calculations, we extracted geometrical data between heavy atoms (O–N, N–C, and C–I distances, and the NCI angle) of the reactant (Re), TS, and product (Pr) states for both SN2 reactions and collected them in Tables S1 and S2 in the Supporting Information. However, the average values of key geometrical parameters over the multiscale methods are presented in Fig. 2. Notably, the NCI angle approximates 180° in all methods, with longer N–C distances in TS compared to Pr and longer C–I distances in TS compared to the Re states. These results corroborate the proposed SN2 mechanism obtained through multiscale methods, a feat unachievable in vacuum calculations without constraints on the NCI angle.Further analysis of the geometrical data reveals subtle differences between the NH2OH and NH2O− reactions. Specifically, the N–C distance in the TS of the NH2OH reaction is shorter than the corresponding average distance in the NH2O− reaction (2.10 versus 2.30 Å), while the average C–I distance in the TS of the NH2OH reaction is longer than the corresponding value in the NH2O− reaction (2.39 versus 2.28 Å).It is worth noting that QM/MM calculations with a 2.5 Å radius for the MM-active region did not yield a product due to the confined space, preventing reactants from approaching for reaction, thus resulting in the exclusion of results from this variant in our dataset.The activation energy of the SN2 reactionsFollowing the discussion of the SN2 reaction mechanism in the previous section, we now delve into the results obtained from multiscale calculations concerning the activation energy of these reactions. We excluded the QM-only results and the energies from the CPCM calculations because they do not yield the expected TS structure without angle constraints. This examination aims to determine which method better replicates experimental activation energies. To this end, we expanded the multiscale calculations by employing mechanical embedding as single-point calculations on stationary structures obtained from the corresponding calculations with electrostatic embedding.Table 1 presents a comparison of the calculated activation energies from different theoretical methods with experimental values for these reactions.8 The table includes MAD, MADtr, MAXtr, relative range with respect to the experimental range (7.4 kcal/mol), ρ, and Kendall’s τ. All MSE values from multiscale calculations are positive and identical to the MAD values, indicating that all methods overestimate the activation energies. The lowest MAD values belong to QM/MM-10.0/0.0 with electrostatic and mechanical embedding (6.2 and 7.6 kcal/mol, respectively), with QM1/QM2-7.5 calculations ranking third and fourth (8.7 and 10.4 kcal/mol for electrostatic and mechanical embedding, respectively). MADtr and MAXtr values are consistent for the same type of method, with QM1/QM2-4.5 with electrostatic and mechanical embedding ranking first and second, respectively. These methods have the closest range to the experimental range after QM/MM-7.5/2.5-Elec (Elec and Mech after the method names refer to the electrostatic and mechanical embedding, respectively).
Table 1 Experimental and calculated activation energies (kcal/mol) of SN2 reactions, with MAD, MADtr, MAXtr, Range, ρ, and τ values for each method, along with method ranks in corresponding quality measures and overall rank. Exp denotes the experimental values. R1 and R2 denote the reactions of NH2OH and NH2O− with methyl iodide, respectively.All R2 values are 1.0, which is expected for comparisons involving only two points, so we have not included them in Table 1. Additionally, all ρ and τ values are either 1.0 or − 1.0. Positive ρ and τ values correctly predict that the activation energy of the first SN2 reaction is higher than that of the second, while negative values indicate incorrect predictions. Among the methods used, QM/MM-7.5/2.5-Elec, QM/MM-5.0/5.0-Elec, and QM1/QM2-7.5 have inaccurately predicted the order of activation energy magnitudes for the SN2 reactions. Overall ranks of the methods, based on MAD, MADtr, range, ρ, and τ ranks (R2 and MAXtr are not considered in determining the overall rank. This is because R2 is identical across all methods, and MAXtr corresponds directly with the MADtr values for each method category), indicate that QM1/QM2-4.5 with electrostatic and mechanical embedding holds the top total rank, with QM1/QM2/MM ranking third (cf. the last row in Table 1).From the results, conclusions can be drawn regarding the impact of the size of the MM-active and MM-fixed regions on the accuracy of computed activation energies. Among the employed QM/MM methods, QM/MM-10.0/0.0 exhibited the lowest MAD values. Notably, this variant with electrostatic embedding achieved the best overall rank (rank 3) among the QM/MM methods. This suggests that larger MM-active regions tend to yield activation energies closer to experimental values. Considering CPU time constraints, our results indicate that the CPU time for QM/MM calculations depends on the overall size of the MM-active and MM-fixed regions. Therefore, when the size of the QM region is fixed, it is advisable to choose a larger MM-active region, up to the hardware’s capacity, and avoid including the MM-fixed region in the model. However, should covalent bonds cross the boundary of the MM-active region, it is advisable to incorporate additional atoms into the MM-fixed region to satisfy the valency of the impacted atoms (this is available in ORCA as is discussed in section “Multiscale Implementation in ORCA”). To address this issue, we developed the pdbtoorca program, which ensures that no covalent bond is cut at the MM-active boundary if it traverses solvent molecules. It includes an option that, if only one atom of a residue is included within the defined distance from the QM system, all atoms of that residue are incorporated into the MM-active region. However, cutting a covalent bond is unavoidable if the MM-active region intersects with protein residues rather than the solvent medium.Mechanism of the Claisen rearrangementIn this section, we delve into the results obtained from studying the mechanism of the Claisen rearrangement. We aim to ascertain whether the QM-only and multiscale methods can accurately reproduce the generally accepted reaction mechanism of Claisen rearrangements, as depicted in Fig. 3—a task for which QM-only calculations previously faltered for SN2 reactions, except when a constraint was applied to the NCI angle. To achieve this, we utilized the QM-only method, QM/MM with MM-active and -fixed region sizes of 4.0 and 2.5 Å, respectively, QM1/QM2 with a QM2 region radius around the QM1 region of 4.5 Å, and QM1/QM2/MM with QM2, MM-active, and MM-fixed region radii of 2.0, 2.5, and 2.5 Å, respectively. All methods were executed with two variants—electrostatic or mechanical embedding—yielding a total of seven methods (one QM-only and six multiscale methods).Fig. 3 illustrates the mechanism of the Claisen rearrangement, wherein a single bond forms between C5 and C1, the double bond C1–C2 transforms into a single bond, the single bond C2–O becomes double, the C3–O bond is broken, the single bond C3–C4 transitions into a double bond, and the double bond C4–C5 converts into a single bond from the Re to the Pr state (the atoms are named on Fig. 3). Distances obtained from all multiscale methods closely aligned with those of the QM-only structures, with maximum MAD values of 0.29 Å (see the last rows of Tables S3-S8 in the Supporting Information). Average distances in the Re, TS, and Pr structures of the reaction are collated from all employed methods in Fig. 3. Additionally, Mayer bond orders for each bond are shown in the figure. Notably, the C3–O bond elongated from 1.45 to 2.12 Å from Re to TS, while the average C1–C5 bond length shortened from 2.29 to 1.55 Å from TS to Pr. These observations indicate that the O3–O bond is breaking and the C1–C5 bond is forming in the TS of the Claisen reaction, consistent with the generally accepted concerted mechanism of Claisen rearrangement. Calculated bond orders further support the proposed mechanism, with the C4–C5 bond order decreasing from 2.0 to 1.0, and the bond order of C2–O transitioning from 0.9 to 2.1 going from Re to Pr. In summary, both the QM-only and multiscale methods accurately reproduce the generally accepted mechanism for Claisen rearrangements.The activation free energy of the Claisen rearrangementFinally, we analyze which of the employed methods is more accurate in reproducing the experimental activation free energies of the Claisen rearrangement. Although no directly reported activation energy is available for the Claisen rearrangement, experimental rate constants for this reaction in water, methanol, and three different mixtures of these solvents were determined (0.79, 1.60, 4.60, 11.00, and 18.00 × 105 s−1 for the reaction in pure methanol (MET), 75% methanol (MET75%), 50% methanol (MET50%), 25% methanol (MET25%), and water (WAT), respectively9. We converted the rate constants to activation free energies using the general equation from transition state theory:$$\Delta G^{\ddag } \, = \,\, – \,RT{\text{ln}}(kh / (k_{{\text{B}}} \, * \,T))$$
(2)
where R is the gas constant, T is the temperature at which the experimental rate constants are measured (333 K), h is Planck’s constant, kB is Boltzmann’s constant, and k is the experimental rate constant of the Claisen rearrangement in different solvents.To expand the number of multiscale methods, we performed single-point calculations on the structures from the multiscale calculations with electrostatic embedding using mechanical embedding (Mech-Sp) and also optimizing the stationary structure along the reaction path with mechanical embedding (Mech-Opt), yielding three variants for each multiscale method and ten different methods overall. Computational activation free energies were calculated by adding zero-point energy (ZPE), thermal corrections, and entropy values to the corresponding multiscale energies. These values were obtained from frequency calculations performed on the optimized QM-only structures, as conducting frequency calculations on the entire QM/MM systems was not feasible due to hardware limitations..The CPCM calculations yielded the same ΔG‡ value for all solvents (25.6 kcal/mol), indicating that the CPCM method is not suitable for calculating ΔG‡ of the Claisen rearrangement. The failure of the CPCM method in predicting the correct ΔG‡ values could be attributed to the similar polarity and volume of the Re and TS states of the Claisen rearrangement, which are almost identical due to the simple positional changes of some bonds (cf. Fig. 3). Consequently, we excluded this method from our dataset and focused solely on the multiscale methods.Table 2 presents the experimental and calculated ΔG‡ values for the Claisen rearrangement along with the quality measures. The MAD values ranged between 5.0 and 11.0 kcal/mol, with the lowest MAD observed for QM1/QM2-Mech-Sp. Similar to the SN2 reactions, all multiscale methods produced positive MSE values, suggesting an overestimation of the ΔG‡ values.
Table 2 Experimental and calculated ΔG‡ values (kcal/mol) of the Claisen rearrangement, with MAD, MADtr, MAXtr, R2, Range, ρ, and τ10 (the τ10 values were calculated for all 10 pairs of the energies) values for each method, along with method ranks in corresponding quality measures and overall rank.The lowest MADtr and MAXtr values were recorded for QM/MM-Elec and QM/MM-Mech-Sp. While all R2 values were poor, the largest value belonged to QM/MM-Mech-Opt. Additionally, the lowest relative range was observed for QM/MM (Elec and Mech-Sp). Notably, the best ρ and τ10 (the τ10 values were calculated for all 10 pairs of the energies) values were obtained for QM1/QM2/MM. Overall, QM1/QM2/MM claimed the top rank (first place), with QM/MM-Elec securing the second rank.The purpose of performing single-point calculations with mechanical embedding on structures optimized with electrostatic embedding was to examine the impact of maintaining consistency in the embedding scheme during both optimization and single-point calculations on the computed activation free energies. As shown in Table 2, the ΔG‡ values change after optimizing the structures with mechanical embedding. Interestingly, Mech-Opt exhibits a better overall rank than Mech-Sp in the QM/MM and QM1/QM2/MM calculations, as demonstrated in the last row of Table 2. Surprisingly, the best overall rank is achieved by the QM1/QM2/MM calculations with Mech-Opt, where both the optimization of the stationary structures and energy calculations are carried out with mechanical embedding. These observations suggest that maintaining consistency in the embedding scheme during optimization and energy calculations can lead to improved results.Fig. 4 displays the efficacy of the five best methods: QM/MM-Elec, QM/MM-Mech-Sp, QM/MM-Mech-Opt, QM1/QM2-Mech-Sp, and QM1/QM2/MM-Mech-Opt. It is evident that all calculated absolute ΔG‡ values lie above the calibration line, indicating an overestimation of most values. The only exceptions are the ΔG‡ values of the reaction in pure water solvent from QM1/QM2-Mech-Sp and QM/MM-Mech-Opt (cf. Fig. 4a). From the methods, QM1/QM2-Mech-Sp yields values closest to the calibration line. Upon removing systematic errors (ranging from 1.3 to 11.0 kcal/mol), some points fall below the calibration line, resulting in a more symmetric distribution of calculated ΔG‡ values around the calibration line (cf. Fig. 4b). This implies that systematic errors are the primary source of deviation of the calculated values from the experimental ones. The precision of Eq. 2 could potentially contribute to these systematic errors, as it appears to underestimate experimental ΔG‡ values derived from the rate constants. After removing systematic errors, the ΔG‡ values of QM/MM-Mech-Sp and QM/MM-Mech-Opt become the closest to the calibration line.Figure 4Performance of multiscale methods in studying the Claisen rearrangement for (a) absolute ΔG‡ and (b) the energies adjusted for systematic error (MSE).

Hot Topics

Related Articles