In silico studies of the open form of human tissue transglutaminase

Analysis of available tTG structures

For computer simulations of the interaction between tissue transglutaminase (tTG) and its inhibitors, a reliable and detailed model of the open conformation of the target protein is required. The high-resolution structures of the open conformation of tTG in the Protein Data Bank (PDB: 2q3z, 3s3j, 3s3p, 3s3s) can serve as such a model [28]. We compared these structures to each other, paying special attention to the amino acid residues critical for the deamidation/transamidation process (Fig. 1).

Figure 1. Comparison of the amino acid residues that take part in the catalysis of deamidation and transamidation in different PDB structures. 2q3z is shown in green, 3s3j in cyan, 3s3p in magenta, 3s3s in orange; the ColabFold-predicted structure is shown in light gray.

The positions of these residues are close to each other across the high-resolution structures, and the protein backbone of tTG also remains consistent. This suggests limited mobility of the protein in this conformation, which makes it a convenient target for structure-based drug design. However, a comparison of the overall protein structures reveals significant differences that might influence the modeling results (Fig. 2). Because none of the structures obtained from the PDB describes the full protein, they differ both in the overall architecture of tTG and in the structure of the active site, which significantly changes the scenario of substrate-protein interaction in binding simulations. The most relevant of these structures is PDB: 2q3z, in which the loop forming the catalytic pocket is described most completely. Even in this structure, part of the loop forming the binding site (residues E319-K327) is missing, which may hamper obtaining meaningful results in structure-based studies.

Figure 2. Comparison of the active site resolution in different PDB structures. The ligand from the 2q3z structure indicates the position of the active site and is shown in red. Significant differences are indicated by cyan arrows. (A) tTG structure from 3s3s. (B) tTG structure from 3s3p. (C) tTG structure from 3s3j. (D) tTG structure from 2q3z. (E) The ColabFold-predicted tTG structure.

We applied homology modeling to derive a unified open conformation of the tTG protein [29]. We compared the four open-conformation structures of tTG from the PDB to select the most homologous structure for modeling with Modeller v. 10.4 [30]. Based on this comparative analysis, the 2q3z structure was chosen for homology modeling, which yielded a prediction of the complete tTG protein structure. However, closer examination of the predicted active site shows a reduced binding surface, which might result in poorer docking outcomes.

An alternative method for modeling the full structure of tTG from its amino acid sequence is AlphaFold-based prediction. The closed conformation of the protein has already been successfully predicted using AlphaFold (ID: AF-P21980-F1). We modeled tTG with a local installation of ColabFold v.1.5.3 using the following parameters: num-recycle = 3; num-seeds = 10; the tTG structure from PDB: 2q3z as a custom template; and a custom input MSA comprising 36 sequences, including the target sequence [31,32,33]. The MSA was built in Unipro UGENE using Clustal Omega and was subsequently modified: amino acids aligned to positions Q276 to C336 of the target sequence were changed to alanine to bias the prediction toward the desired conformational state [34,35,36,37].

As shown in Fig. 1, all amino acid residues of the predicted tTG structure that are critical for the deamidation and transamidation reactions align well with those of the available PDB structures (the PDB file of the predicted structure can be found in the Supplementary Materials). Furthermore, the derived structure (Fig. 2E) is the most complete and can provide the most precise insight into the actual protein–ligand interaction. Of the structures considered, the ColabFold-predicted structure is therefore the best target for the in silico development of tTG inhibitors.

The architecture of the tTG active site

An understanding of the key features of the binding site of the open conformation of tTG is necessary to guide binding mode analysis and subsequent rational design (Fig. 3). The site lies predominantly on the protein surface and has a saddle-like shape, which poses challenges for virtual inhibitor design. In particular, this architecture requires a bend of a certain angle in the ligand core structure. In peptidomimetic inhibitors, for example, such a bend can be introduced by including a proline residue.

Figure 3. The active site of tTG. (A) Three cavities are shown schematically as green, blue and red spheres. The saddle-like structure is shown with the yellow dotted line. (B) Amino acid residues comprising the cavities of the active site. (C) Comparison of cavities and amino acid residues. The catalytic cavity is shown as the blue sphere, with residue C277 inside it.

Based on the obtained structure, three cavities can be delineated within the active site (Fig. 3A, marked schematically by coloured spheres). To reach a contact surface area sufficient for reliable interactions with the active site, a ligand must occupy at least two of these three cavities. The largest cavity (shown in red) is a wide pocket that can fit larger parts of the ligand, such as aromatic groups.
The cavity shown in green is located on the surface of the tTG active site, and the cavity containing residue C277 (shown in blue) is a narrow, deep pocket.

Each of the open-conformation protein structures available in the PDB contains an irreversible deamidated peptide-like inhibitor within its active site, which keeps tTG in the open state. To elucidate the interactions between the amino acid residues of the active site and the ligand, close contacts (such as hydrogen bonds) and aromatic interactions (pi stacking, parallel-displaced stacking, T-shaped stacking) were quantified in each of the four structures. The results of these calculations are shown in Fig. 4.

Figure 4. Interactions between ligands and the tTG active site. Close contacts are shown with black dotted lines, aromatic interactions with orange dotted lines. (A) Ligand from the 2q3z structure. (B) Ligand from the 3s3j structure. (C) Ligand from the 3s3p structure. (D) Ligand from the 3s3s structure. (E) Number of close contacts and aromatic interactions calculated for each of the ligands.

It is noteworthy that all ligands in these target-ligand complexes exhibit similar characteristics (a peptidomimetic structure containing three amino acid residues, similar size, and the ability to occupy all three cavities of the tTG binding site), both among themselves and when compared to the most successful commercial irreversible inhibitors of tTG. We also noticed that at the top of the binding-site saddle, where the ligand-site contact surface area is reduced, all of the considered peptidomimetic inhibitors form two hydrogen bonds with residue N333, which compensates for the missing binding affinity through directed interactions. Finally, we observed the significance of adding residue F320 to the structure of the active site: it closes the gap present in all open-state PDB structures of tTG.

To assess binding affinity, molecular docking was conducted using the Gnina docking software [38,39]. The tTG structure obtained earlier with the AlphaFold-based algorithm was selected as the docking target. To validate this target, we performed cross-docking of the ligands from 2q3z, 3s3j, 3s3p and 3s3s (Fig. 5). We optimized the docking outcomes by removing the sulfur atom (which represented the thioester bond between residue C277 and the substrate) from the active site of tTG. Additionally, flexible docking was performed with residues W332, H335 and W241, which together with the catalytic C277 form one cavity, and with Q169, which forms another cavity (Fig. 3, in green). These residues were chosen based on the positions they could potentially occupy to enhance the binding affinity between the protein and the ligand.

Figure 5. Molecular docking results in comparison to the original position of the ligand. The original position is shown in blue, the predicted position in orange. (A) Cross-docking of the 2q3z ligand to the tTG active site with the flexible Q169 residue. (B) Cross-docking of the 3s3j ligand to the tTG active site with the flexible Q169 residue. (C) Cross-docking of the 3s3p ligand to the tTG active site with the flexible Q169 residue. (D) Cross-docking of the 3s3s ligand to the tTG active site with the flexible Q169 residue.

During the docking process, several issues were identified. After the excess sulfur was removed from the active site, the position of the 2q3z ligand was predicted accurately; however, this did not improve the poor results of the other dockings. Making residues W241 and H335 flexible allowed them to block access to the cavity containing C277, which adversely affected the docking outcomes.
Conversely, making residue Q169 flexible improved the binding energy and the precision of ligand cross-docking. However, docking remained inaccurate in predicting the positions of the 3s3s and 3s3p ligands. We assume that the part of both ligands responsible for covalent inhibition of residue C277 is too long and therefore does not fit into the cavity. Another reason for the inaccurate docking results may lie in the excessive flexibility of peptidomimetic ligands: for instance, the ligand from PDB: 3s3s has 20 rotatable bonds, which makes it hard to enumerate all possible conformations.

To validate our assumptions, we performed 100 ns molecular dynamics simulations of our complete open-state tTG structure in the apo form, in complex with the ligand from 2q3z (whose position was precisely predicted by molecular docking), and in complex with the ligand from 3s3s (whose docked position was less accurate), using Gromacs v.2022-6 [45,46,47]. The open conformation of the protein remains consistent (apo RMSD, Fig. S1), and both ligands retain their positions in the active site throughout the 100 ns simulation (ligand RMSD, Fig. S1). The active site itself partially rearranges to increase the structural stability of the complex, which can also explain the difference between the experimental ligand positions and the positions obtained from docking.

Overall, the optimal molecular docking input parameters for tTG have been estimated. Our ColabFold-predicted protein structure enables precise cross-docking of ligands, indicating its accuracy and its potential for structure-based drug design.
The sulfur left from the covalent bond between ligand and protein should be removed from the cavity, flexible docking should be performed with Q169 as the flexible residue, and the exhaustiveness should be at least 32 to reach the most precise results.

A library of new ligand-based and structure-based tTG-targeted small molecules

To determine which ligands in the known chemical space might exhibit optimal binding while remaining distinctive, we obtained from the ChEMBL database a list of chemical compounds that had previously been evaluated for binding to tTG and filtered them by their IC50 values. As a result, a set of 169 compounds with known structural formulas and IC50 < 500 nM was obtained.

To streamline the time-intensive process of selecting ligands, we applied Butina clustering based on Morgan circular fingerprints (which encode the presence of specific substructures) of radius 3 represented with 2048 bits, using the Tanimoto similarity measure [40,41]. We identified ten major clusters and chose one representative compound from each cluster (Table S1.2). These representatives were subsequently used for docking and for searching for analogous molecules in the known chemical space.

By searching for similar substructures within the ChemRar database, we assembled a library of 2066 molecules potentially inhibiting tTG (Table S1.3). To discern the most prevalent patterns in this compound selection, we generated Bemis-Murcko scaffolds (BMS) for the original 169 ChEMBL compounds and for the ChemRar compound library using the RDKit toolkit in Python [42].

To determine which scaffolds were most commonly present in both datasets, we set a threshold for a "sufficient" number of compounds per scaffold (5 for ChEMBL compounds and 20 for ChemRar library compounds).
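The scaffold frequency thresholding described above can be illustrated with a minimal Python sketch. It assumes that each compound has already been reduced to a canonical scaffold SMILES string (for example with RDKit, as in the text); the function names, the toy SMILES and the toy counts are hypothetical and are not taken from the study data.

```python
from collections import Counter


def frequent_scaffolds(scaffold_smiles, min_count):
    """Return {scaffold: count} for scaffolds that occur at least min_count times.

    scaffold_smiles: one canonical scaffold SMILES string per compound
    (assumed to be precomputed; identical scaffolds must give identical strings).
    """
    counts = Counter(scaffold_smiles)
    return {smi: n for smi, n in counts.items() if n >= min_count}


def shared_scaffolds(frequent_a, frequent_b):
    """Return the scaffolds that are frequent in both datasets, sorted."""
    return sorted(set(frequent_a) & set(frequent_b))


# Toy data (hypothetical), one scaffold string per compound.
chembl_scaffolds = ["c1ccccc1"] * 6 + ["C1CCNCC1"] * 2
library_scaffolds = ["c1ccccc1"] * 25 + ["c1ccncc1"] * 21 + ["C1CCOC1"] * 3

# Thresholds as stated in the text: 5 for ChEMBL, 20 for the library.
chembl_frequent = frequent_scaffolds(chembl_scaffolds, min_count=5)
library_frequent = frequent_scaffolds(library_scaffolds, min_count=20)
common = shared_scaffolds(chembl_frequent, library_frequent)
```

Comparing scaffolds by string equality only works if both datasets are canonicalized in the same way, which is why the sketch treats the scaffold strings as precomputed input.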
Applying these thresholds yielded 6 scaffolds among the experimentally verified compounds in ChEMBL and 12 scaffolds in the assembled library (Fig. 6). One scaffold was identical in both groups (scaffold 5 in the ChEMBL dataset and scaffold 11 in the ChemRar dataset), while the remaining scaffolds represented a different chemical space. The shared scaffold contains a sulfonic fragment that gives the core structure of small-molecule ligands the same angle that a proline residue provides in the peptidomimetic inhibitors.

Figure 6. (A) Most frequent scaffolds from the ChEMBL dataset of tTG ligands. (B) Most frequent scaffolds from the assembled tTG-targeted library. The scaffold identical between the two datasets is indicated by an asterisk.

We performed molecular docking of the ten representative compounds from the ChEMBL clustering using Gnina with the ColabFold-predicted protein structure and the flexible residue Q169 to enhance the binding affinity and precision of the docking process. To compare the docking binding free energies with experimental data from ChEMBL, we used the formula that relates binding free energy to the dissociation constant:

$$\Delta G = RT\ln\left(\frac{K_{d}}{c}\right), \quad c = 1\,\mathrm{M} \tag{1}$$

where ΔG is the molar Gibbs free energy, Kd is the dissociation constant, R is the ideal gas constant, T is the temperature and c is the standard reference concentration. All calculations are listed in Table S1.2. Although we observed no strong correlation between the experimental ChEMBL data and the modeling results, all obtained values differ by no more than 2 kcal/mol, a gap that can be recovered by accounting for the additional energy of close contacts. The only compound with a significant difference between experimental and modeling results is CHEMBL3423197, whose relatively small size explains its lower predicted binding energy.

Finally, to assess binding efficacy and develop a series within the tTG-targeted library, five diverse compounds were selected by visual inspection for each of the 12 frequently occurring scaffolds. Using our prior knowledge of ligand docking to the open conformation of tTG, we modeled the binding of each selected compound to the ColabFold-predicted protein structure (Table S1.2).

The docking results for the small molecules were compared in terms of binding free energy both among themselves and with the previously docked peptidomimetic inhibitors from the described PDB structures (Fig. 7). Notably, the closest energy values with the least scatter were observed for compounds with scaffolds 3–5 and 10–12, indicating comparatively higher prediction accuracy for compounds with a similar core structure. Furthermore, scaffolds 4, 5 and 12 appeared to be more favorable for binding to tTG, whereas scaffolds 10 and 11 were less advantageous. Scaffold 11 is the only scaffold present in both the ChEMBL set and our tTG-targeted library, and it is among the less favorable ones; therefore, the novel compounds selected from the library exhibit higher predicted binding affinity to the enzyme than the already known compounds.

Figure 7. (A) Binding free energy of compounds from the tTG-targeted library obtained by docking, 5 compounds per scaffold.
The binding free energy of peptidomimetic compounds from PDB tTG structures obtained by docking is shown in blue, and the binding free energy of ChEMBL compounds is shown in orange. Dots outside the boxplot whiskers are considered outliers. (B) Ligand efficiency of compounds from the tTG-targeted library, 5 compounds per scaffold. The ligand efficiency of peptidomimetic compounds from PDB tTG structures obtained by docking is shown in blue, and the ligand efficiency of ChEMBL compounds is shown in orange. Dots outside the boxplot whiskers are considered outliers.

Peptidomimetic inhibitors showed some of the most favorable energies even when the ligand position was predicted inaccurately (Fig. 5C,D). However, in terms of ligand efficiency (binding free energy per heavy atom), peptidomimetic ligands, as expected, do not show a favorable profile compared to the other compound groups. The compounds with scaffolds 7, 8 and 10, on the other hand, exhibit the highest ligand efficiency.

In our analysis, we also identified the most notable outliers in the tTG-targeted library (compounds deviating from the others in their group by more than 1 kcal/mol in terms of binding affinity and, for ligand efficiency, by more than 0.5 kcal/mol). One compound with scaffold 4, which is already among the best groups of molecules in terms of affinity for the tTG active site, shows exceptional binding free energy and is therefore of interest for further research. One compound with scaffold 9 has outstanding ligand efficiency, even though scaffold 9 itself is not among the best in terms of affinity, and could be modified into an effective tTG inhibitor. Overall, promising scaffold-defined series of potential tTG inhibitors have been identified for further detailed analysis.
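Equation (1) and the ligand efficiency measure (binding free energy per heavy atom) can be made concrete with a short Python sketch. The temperature of 298.15 K, the example Kd of 500 nM and the heavy atom count of 30 are illustrative assumptions, since the text does not state the values used; treating an experimental potency value as a dissociation constant is likewise an assumption of this sketch.

```python
import math

# Ideal gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def delta_g_from_kd(kd_molar, temperature_k=298.15, c_molar=1.0):
    """Eq. (1): dG = R * T * ln(Kd / c), returned in kcal/mol.

    temperature_k defaults to 298.15 K (an assumption, not stated in the text);
    c_molar is the 1 M standard reference concentration.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar / c_molar)


def ligand_efficiency(delta_g_kcal, heavy_atoms):
    """Binding free energy per heavy atom, in kcal/mol per atom."""
    if heavy_atoms <= 0:
        raise ValueError("heavy atom count must be positive")
    return delta_g_kcal / heavy_atoms


# Hypothetical example: Kd = 500 nM and a ligand with 30 heavy atoms.
example_dg = delta_g_from_kd(500e-9)
example_le = ligand_efficiency(example_dg, heavy_atoms=30)
```

With this sign convention, tighter binding gives a more negative ΔG, so a more negative value per heavy atom corresponds to a more efficient ligand.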
