Structural and free energy landscape analysis for the discovery of antiviral compounds targeting the cap-binding domain of influenza polymerase PB2

Virtual screening

Virtual screening is a computational technique for finding bioactive molecules in large chemical libraries by identifying those likely to bind a specific biological target. We used it to select compounds from the Diverse lib database on the MTIopen web server [13]. In total, 1500 compounds from this database were screened and ranked by the binding energies obtained during screening, which ranged from −10.5 to −7.8 kcal/mol (Supplementary Table S1). On the basis of their binding energies, four compounds were selected: compound 1 (81066370), compound 2 (24334311), compound 3 (85303482) and compound 4 (85176812) (Table 1).

Table 1 List of the four selected compounds.

To predict the safety and toxicity of these compounds, we also performed an ADMET (absorption, distribution, metabolism, excretion, and toxicity) analysis using the ADMETlab 2.0 web server. Compound 81066370 shows high lipophilicity (LogP: 4.248) and strong inhibition of P-glycoprotein, which can enhance drug retention in cells. It has high plasma protein binding (98.63%) and moderate blood–brain barrier permeability, suggesting potential central nervous system activity. The compound shows significant interactions with cytochrome P450 (CYP) enzymes, which was taken to indicate metabolic stability. Despite its low solubility and absorption, it satisfies Lipinski's rule of five and the Golden Triangle rule, indicating good drug-likeness.

Compound 24334311 shows high lipophilicity (LogP: 3.989) and strong P-glycoprotein inhibition, with excellent blood–brain barrier permeability (BBB: 0.989) and significant interactions with CYP enzymes, which was again taken to indicate metabolic stability.
It has high plasma protein binding (94.40%), with a comparatively large fraction of the drug remaining unbound in plasma (10.66%), which is favorable for pharmacological activity. This compound also complies with Lipinski's rule of five and the Golden Triangle rule, indicating good drug-likeness and potential therapeutic efficacy.

Compound 85303482 has a LogP of 2.982 and shows minimal interaction with P-glycoprotein, either as an inhibitor or as a substrate. It has high plasma protein binding (94.49%) and moderate blood–brain barrier permeability, which may be beneficial for central nervous system applications. It also interacts significantly with CYP enzymes, which was taken to indicate metabolic stability. It satisfies Lipinski's rule of five and the Golden Triangle rule, supporting its potential as a drug candidate (Table S2).

Re-docking and intermolecular analysis

Re-docking re-evaluates the interactions between ligand molecules and the receptor protein in order to validate the initial docking results and to clarify the key interactions at the binding site. For re-docking, the grid center was set at X = −50.0, Y = 4.48, Z = −3.14, with a grid dimension of 20 Å along each axis (X, Y, Z). Each ligand was docked to the target protein and evaluated against a benchmark molecule, generating at least nine distinct poses per ligand–receptor pair. The pose with the most favorable docking energy, that is, the most negative value, was chosen for in-depth analysis of complex stability and ligand affinity. Simultaneously, a virtual screening was conducted to explore a large library of medicinal compounds.
This screening highlighted four compounds with the most favorable binding energies: −10.4, −10.1, −9.9 and −9.8 kcal/mol. These compounds, referred to as compounds 1, 2, 3 and 4, were selected for further study. A control molecule, "93G", with a binding energy of −7.8 kcal/mol, was used for comparison. The three-dimensional structures of the complexes, including the control, were visualized using PyMOL [32], and two-dimensional interaction diagrams were generated with Discovery Studio [33], showing the spatial conformations and interactions of the ligands within the binding site (Fig. 1).

Figure 1 The 3D and 2D structures of the selected compounds in complex with the protein: (a, b) compound 1, (c, d) compound 2, (e, f) compound 3, (g, h) compound 4 and (i, j) control.

Table 2 summarizes the molecular interactions between each selected compound and the target protein. Compound 1 formed a single hydrogen bond with Arg355 and eight hydrophobic interactions, with Phe404, Arg332, Asn429, Met431, Ser324, His432, Ser337 and Phe363. It also formed two π–π stacking interactions, with Phe323 and His357. Compound 2 formed eight hydrophobic interactions, with Ser337, Glu361, Phe363, Phe404, Met431, Glu341, Asn429 and Arg332, and two π–π stacking interactions, with Phe323 and His357.
Compound 3 formed seven hydrophobic interactions, with Glu361, Arg332, Ser337, Phe363, Phe404, Asn429 and Ser324, and three π–π stacking interactions, with Phe323, His357 and Phe325. Compound 4 formed one hydrogen bond with His357 and nine hydrophobic interactions, with Glu341, Lys353, Ile354, Gln406, Glu361, Lys376, Ser337, Arg332 and His432, together with a single π–π stacking interaction with Phe323. The control compound formed two hydrogen bonds, with Ser324 and Glu361, seven hydrophobic interactions, with His432, Phe325, Arg332, Ser337, Phe363, Lys376 and Phe404, and one π–π stacking interaction.

Table 2 Intermolecular interactions of the protein in complex with (a) compound 1, (b) compound 2, (c) compound 3, (d) compound 4 and (e) the control compound.

MD simulation

Molecular dynamics (MD) simulations are used to assess the dynamic stability of protein–ligand complexes and to follow molecular interactions over time [34,35,36,37]. To examine the binding interactions and mechanistic behavior of the four selected compounds with the target protein, we ran a 500-ns MD simulation for each complex. A well-characterized control compound was simulated for comparison. The trajectory of each complex was recorded throughout the simulation, so that conformational changes during binding could be followed.
Analysis of the simulation data allowed us to assess key interaction parameters, including binding affinity, conformational changes within the protein, and shifts in the energy landscape that affect the stability and function of the protein–ligand complex. Compound 2 performed poorly in the MD simulation and was therefore excluded; subsequent analyses focus on compounds 1, 3 and 4. The first and last poses of each complex, before and after the MD simulation, are shown in Fig. 2.

Figure 2 The 2D structures of the selected compounds in complex with the protein: (a, b) compound 1, (c, d) compound 2, (e, f) compound 3, (g, h) compound 4 and (i, j) control.

RMSD analysis

In the complexes with compounds 1 and 3, the protein root-mean-square deviation (RMSD) generally remained below 2 Å, indicating stable protein conformations, although with compound 3 it rose to 2.5 Å between 150 and 250 ns. With compound 4 and with the control compound, the protein RMSD stayed below 3 Å throughout the simulation. Regarding ligand stability, compound 1 showed a ligand RMSD of about 2 Å until 210 ns, after which it increased but remained below 4 Å for the rest of the simulation. Compound 3 showed a ligand RMSD of about 3.5 Å, with minor fluctuations up to 200 ns that are likely due to conformational changes within the ligand. Compound 4 maintained a ligand RMSD below 4 Å over the entire 500 ns. The control ligand remained below 3 Å over the 500 ns simulation, suggesting a relatively stable interaction between ligand and protein over time.
These findings characterize the dynamic stability and conformational integrity of the ligand–protein interactions for each compound (Figs. 3 and 4).

Figure 3 RMSD analysis of the protein in complex with the three selected compounds: (a) compound 1, (b) compound 3, (c) compound 4 and (d) the control.

Figure 4 RMSD analysis of the three selected ligands in complex with the protein: (a) compound 1, (b) compound 3, (c) compound 4 and (d) the control.

RMSF analysis

In complex with compound 1, the protein showed a root-mean-square fluctuation (RMSF) below 3 Å for residues 90 to 110. Such low RMSF values indicate that these residues deviate little from their average positions when compound 1 is bound, consistent with a tight and stable interaction in this segment of the protein. In complex with compound 3, the protein RMSF for residues 90 to 110 remained below 5 Å. Although this is higher than the value observed with compound 1, it still represents a relatively stable interaction in this region, with somewhat more flexibility. In complex with compound 4, by contrast, the protein RMSF exceeded 3 Å for residues 130 to 140. This indicates greater fluctuation in this region and suggests that the protein structure is less stable there when interacting with compound 4, which might affect the stability of the compound–protein interaction. For the same residues, 130 to 140, the protein RMSF in the control complex was below 3 Å. The RMSF analysis of all complexes is shown in Fig. 5.

Figure 5 RMSF analysis of the protein in complex with the three selected compounds: (a) compound 1, (b) compound 3, (c) compound 4 and (d) the control.

Radius of gyration analysis

The radius of gyration (RG) is a quantitative measure of the compactness of a protein's tertiary structure and provides insight into its structural stability [38]. A higher RG value indicates looser packing, which may correlate with reduced stability because of increased vulnerability to environmental perturbations. Conversely, a lower RG indicates a tightly packed, more stable conformation. The RG profiles of the complexes are shown in Figure S1.

Hydrogen bond analysis

In the initial stages of the simulation, compound 1 formed one to two hydrogen bonds. As the simulation progressed, these hydrogen bonds became less persistent and were confined to particular portions of the trajectory, which suggests a possible change in the conformation of the binding site. Compound 3 showed a more stable but minimal hydrogen bonding pattern: it started the simulation with a single hydrogen bond, which persisted throughout. The persistence of only one hydrogen bond suggests a rigid molecular structure or a constrained environment that limits further bonding. Compound 4 maintained one to two hydrogen bonds throughout the simulation. This consistent pattern indicates stable interactions without significant changes in bonding.
The control compound showed a more extensive and sustained hydrogen bonding pattern, consistently forming two to three hydrogen bonds over the full 500 ns of the simulation. This behavior reflects a strong and stable interaction, with multiple bonding sites engaged over an extended period, and makes the control a useful standard for comparison (Fig. 6).

Figure 6 Hydrogen bond analysis of the three compounds in complex with the protein: (a) compound 1, (b) compound 3, (c) compound 4 and (d) the control.

Binding free energy analysis

The binding free energies of the complexes were evaluated using the molecular mechanics/generalized Born surface area (MM/GBSA) approach. This technique decomposes the overall binding energy (ΔG) into its constituent energy components, allowing the energetic contributions of the ligand–protein interactions to be analyzed in detail (Table 3). Compound 1 showed a ΔG of −48.46 ± 8.48 kcal/mol, while compounds 3 and 4 showed ΔG values of −52.95 ± 8.14 kcal/mol and −53.30 ± 11.57 kcal/mol, respectively. The control complex showed a ΔG of −52.14 ± 45.08 kcal/mol. The mean ΔG values of the three compounds are close to that of the control complex, indicating comparable binding to the target protein. The standard deviations for the three compounds (8.14 to 11.57 kcal/mol) are considerably smaller than the value reported for the control (45.08 kcal/mol), so the compound estimates are more tightly bounded than the control estimate.
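The spread of each reported MM/GBSA estimate relative to its mean can be checked with a short calculation. The sketch below is only illustrative: it uses the means and standard deviations quoted in the text, and the helper name `relative_spread` is hypothetical, not part of any MM/GBSA package.

```python
# Reported MM/GBSA means and standard deviations (kcal/mol), copied from the text.
reported = {
    "compound 1": (-48.46, 8.48),
    "compound 3": (-52.95, 8.14),
    "compound 4": (-53.30, 11.57),
    "control": (-52.14, 45.08),
}


def relative_spread(mean, sd):
    """Return the standard deviation as a fraction of the absolute mean."""
    return sd / abs(mean)


# Hypothetical helper applied to every complex.
spread = {name: relative_spread(m, s) for name, (m, s) in reported.items()}

# Print the complexes from the tightest to the widest estimate.
for name, value in sorted(spread.items(), key=lambda kv: kv[1]):
    print(f"{name}: SD is {value:.0%} of |mean|")
```

With these inputs the control has by far the largest relative spread, which is worth keeping in mind when comparing its mean ΔG with those of the compounds.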
Collectively, these data suggest that the compounds under study have a strong affinity for the target protein and have the potential to modulate its activity, supporting their further evaluation as inhibitors.

Table 3 MM/GBSA analysis of the three compounds in complex with the protein: (a) compound 1, (b) compound 3, (c) compound 4 and (d) the control.

RG-RMSD and PCA based free energy landscape

The free energy landscape (FEL) describes how energy is distributed over the conformations sampled by a biomolecular system in an MD simulation. It outlines the stable states, the energy barriers between them, and the transition pathways among different conformations, and therefore reflects both the thermodynamic and the kinetic properties that govern the structural changes and stability of biomolecular complexes. The radius of gyration (RG) reports how much the protein compacts or expands, while the root-mean-square deviation (RMSD) reports the dynamic stability of the complex by monitoring deviations from a reference conformation over time. The 2D (Fig. 7) and 3D (Figure S2) free energy landscapes were generated by mapping RG against RMSD using the Geo-Measures plugin in PyMOL. These plots show the energy distribution and the structural transitions of each complex. For comparison, we also computed a PCA-based FEL.
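The RG-RMSD landscape described above can be sketched as a Boltzmann inversion of a two-dimensional histogram, G = kB·T·ln(P_max / P). This is a minimal illustration and not the plugin's implementation: the function name, the 0.1 Å bin width and the 300 K temperature are assumptions, since the text does not state them.

```python
import math
from collections import Counter

# Molar gas constant (Boltzmann constant times Avogadro's number), kJ/(mol*K).
KB_KJ_PER_MOL_K = 0.0083144626


def free_energy_landscape(rmsd, rg, bin_width=0.1, temperature=300.0):
    """Map each occupied (RMSD, RG) bin to a relative free energy in kJ/mol.

    bin_width and temperature are assumed values, not taken from the study.
    The most populated bin is the reference and has G = 0.
    """
    if len(rmsd) != len(rg) or not rmsd:
        raise ValueError("rmsd and rg must be non-empty and of equal length")
    # Count how many frames fall into each two-dimensional bin.
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(rmsd, rg)
    )
    max_count = max(counts.values())
    kt = KB_KJ_PER_MOL_K * temperature
    # Boltzmann inversion relative to the most populated bin.
    return {b: kt * math.log(max_count / c) for b, c in counts.items()}


# Toy per-frame values in angstroms (made up for illustration only).
toy_rmsd = [1.02, 1.05, 1.08, 1.31, 1.04]
toy_rg = [18.01, 18.03, 18.05, 18.22, 18.02]
fel = free_energy_landscape(toy_rmsd, toy_rg)
```

Bins with G near zero correspond to the low-energy basins of such a landscape, and rarely visited bins receive higher values.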
The PCA-based free energy landscapes in panels (a), (b), (c) and (d) of Figure S6 provide insight into the conformational stability of each system. Panel (a) displays a distinct basin of low Gibbs free energy, indicating highly stable conformations, while the higher-energy regions correspond to less stable states. Panel (b) similarly shows stable conformations but also reveals additional metastable states, suggesting that the system can occupy several less stable configurations. Panel (c) spans a wider range of PC1, with a broader low-energy basin that indicates greater conformational flexibility. Panel (d), plotted on an adjusted PC2 scale, shows a stable central basin surrounded by energy barriers, which represent transitions to less stable states. Together, these landscapes describe the stability of the sampled conformations and the thermodynamic properties of each system (Figure S6).

Figure 7 RG-RMSD based free energy landscape analysis of the three compounds in complex with the protein: (a) compound 1, (b) compound 3, (c) compound 4 and (d) the control.

We applied the same framework to the ligand-bound complexes and to the control. The two-dimensional (Fig. 7) and three-dimensional (Figure S2) landscapes show the energetic consequences of the conformational changes that occur during the simulation. This visual approach allowed us to identify the low-energy conformations of each system, which correspond to its energetically favorable states.
Deep blue regions of the free energy landscape mark localized energy minima, which were consistently observed as the protein structures settled into their lowest-energy configurations. All complexes, including the control, retained such localized minima within the broader landscape. A notable feature shared by the complexes is a relative energy maximum in the range of 14 to 16 kJ/mol. This uniformity suggests that the complexes may be similar in structural or functional traits, such as molecular interactions or stability. In addition, all complexes showed a stable conformation at energies below 2 kJ/mol, depicted by the dark blue area of the landscape, which supports their thermodynamic stability.

From each complex we extracted three distinct poses from the low-energy states, which are associated with the more stable and energetically favorable conformations. These poses were superimposed on the initial pose of the respective complex, which served as the reference, to allow a direct comparison of the structural deviations that occurred during the simulation (Fig. 8). The initial pose provides a baseline against which the structural integrity and deviations of the subsequent poses can be assessed.
To quantify these deviations, we calculated the root-mean-square deviation (RMSD) for each superimposed structure. RMSD is widely used in structural biology and is the square root of the mean squared distance between corresponding backbone atoms of superimposed proteins. It indicates how much the compared structures differ in conformation, that is, how far each pose departs from the initial pose. Compound 1 showed an RMSD of 1.179 Å, indicating minimal deviation from its initial pose and high structural stability under the simulation conditions. Compound 3 had an RMSD of 1.364 Å, the largest of the set, pointing to somewhat greater conformational change during the simulation. Compound 4 showed the smallest RMSD, 1.067 Å, implying the greatest conformational stability among the compounds tested. The control complex showed an RMSD of 1.206 Å, which provides a baseline for the degree of structural fluctuation that can be considered normal in this setup. These RMSD values give a quantitative view of how each complex deviates from its initial structure in its low-energy states.

Figure 8 Superimposed representation of the three selected compounds in complex with the target protein: (a) compound 1, (b) compound 3, (c) compound 4 and (d) control.
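The RMSD between two already superimposed structures can be computed directly from matched atom coordinates. The following is a minimal sketch under stated assumptions: the function name and the toy coordinates are illustrative, the two coordinate lists are assumed to contain the same backbone atoms in the same order, and no fitting step is performed, so any superposition must be done beforehand.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes the structures are already superimposed and that atom i in
    coords_a corresponds to atom i in coords_b. Units follow the input.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom displaced by 1 angstrom along y.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
displaced = [(x, y + 1.0, z) for x, y, z in reference]
toy_value = rmsd(reference, displaced)
```

Because every atom in the toy example moves by exactly 1 Å, `toy_value` equals 1.0; identical structures give 0.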
