AlphaFold predictions of fold-switched conformations are driven by structure memorization

AF samples both conformations of 35% of fold switchers likely in its training set

AF’s ability to sample two folds assumed by single sequences was tested on 92 pairs of experimentally determined fold switchers. To our knowledge, these 92 pairs (Supplementary Data 1) include fold switchers from many diverse fold families and source organisms16. These structural pairs are likely in AF2.3.1’s training set because they were all deposited in the PDB before 2022, and all of them were in the training set of OpenFold12,27, an AI-based model with the same architecture and performance as AF2. Most of them (85%) are also likely in AF3’s training set, based on the published methods5. All protein pairs have identical or nearly identical sequences and regions of distinct secondary and tertiary structure. AF predictions are defined as successful when they accurately capture both experimentally determined conformations, called Fold1 and Fold2. Prediction accuracy is assessed by calculating the TM-score28 between each AF prediction and both experimentally determined conformations. TM-scores quantify the similarity of topology and connections between secondary structure elements29, a reliable metric since fold-switching proteins are identified by secondary structure differences18. Because whole-protein TM-scores often overestimate the prediction accuracies of fold-switching regions, we assessed predictions using TM-scores of fold-switching regions only (Supplementary Fig. 1). Higher TM-scores indicate predictions closer to experimentally determined conformations. We ordered each pair of fold switchers so that Fold1 corresponds to the target conformation most frequently predicted by AF2, and Fold2 corresponds to the less frequently predicted target conformation (Methods: Defining Fold1 and Fold2). To augment this TM-score-based assessment, we also performed root-mean-square-deviation (RMSD) calculations of fold-switching regions and found similar results (Supplementary Fig. 2).

First, four different AF2.3.1 modes and AF3 were tested on each fold-switching sequence: with templates, without templates, multimer model on single chains, and multimer model on protein complexes (Supplementary Data 2). AF2.3.1’s performance increased slightly above AF2.0’s (Fig. 1a), capturing 11/92 fold switchers (combining results both with and without templates) rather than 8/92 (ref. 16). Furthermore, AF2_multimer successfully predicted both conformations of 12/92 fold switchers. Surprisingly, AF3 underperformed relative to AF2, capturing both conformations of 7/92 fold switchers in total. Since AF3 was updated to model interactions between proteins and other biomolecules, we included as many binding partners as possible in the modeling: DNA, RNA, ions, and other ligands (Supplementary Data 3). Although the AF3 webserver is currently limited to a subset of biomolecules, 70% of the interactions in our dataset could be fully modeled with the ligands available (115/165).

Fig. 1: AF predicts fold switching with modest success. a Numbers of successful fold-switch predictions for each AF2 method and AF3 compared with coevolutionary information found for both folds (ACE) and the total number of possible successes (dotted red line). All_AF2 combines all unique successful predictions from all AF2-based methods: >282,000 predictions. Predictions successfully made by more than one AF method are black; predictions unique to each method are gray. b Fraction of predicted structures that match experimentally determined conformations for all methods. Fold1 is the conformation most frequently sampled by AF2.3.1; Fold2 is the less frequently sampled (or unsampled) conformation. Conformations designated as Other are inconsistent with both experimentally determined structures.
Source data are provided as a Source Data file.

Although fold switching is often triggered by interactions with other proteins or biological molecules18, supplying this information to the Multimer model and AF3 yielded only nine unique fold-switch predictions, seven of which were predicted using single chains by other AF2.3.1 methods (Supplementary Data 2). Both the TM-score and RMSD-based assessments demonstrated that running AF2.3.1 with default inputs and parameters and default AF3 with appropriate interacting biomolecules infrequently produce successful fold switch predictions: 21% combined.

We then tested whether AF2-based enhanced sampling approaches can predict more fold switchers than AF runs with standard inputs. Recently, two such approaches have been proposed to predict alternative conformations of proteins including fold switchers. The first, SPEACH_AF17, masks coevolutionary information in AF2’s input MSA by mutating selected columns to alanine in silico. Masking this information is expected to allow AF2 to identify coevolutionary signals in the MSA corresponding to alternative protein conformations, allowing it to sample a more diverse conformational ensemble. SPEACH_AF was tested on 16 different proteins and generated alternative conformations for almost all of them. Though none of these proteins were fold switchers, SPEACH_AF’s potential to predict fold switching was proposed17. The second approach, AF-cluster15, clusters sequences from a deep MSA by similarity and runs AF2 on individual clusters. This approach is based on the hypothesis that different MSA subsets may contain coevolutionary information distinct from deep MSAs, allowing AF2 to predict alternative protein conformations, though recent work suggests that AF-cluster may infer alternative conformations from its PDB training rather than coevolutionary inference30, limiting its robustness.
Regardless, AF-cluster was tested on six families of fold-switching proteins and successfully predicted both conformations in three families15.

To gauge how frequently SPEACH_AF and AF-cluster predict fold switching, we tested both approaches extensively on the set of 92 fold switchers tested previously, generating >77,000 structures with SPEACH_AF and >200,000 structures with AF-cluster (Supplementary Data 2). Both methods missed fold switching in most cases (Fig. 1a): 92% for SPEACH_AF (7/92 successes) and 80% for AF-cluster (18/92 successes).

As mentioned previously, both SPEACH_AF and AF-cluster postulate that AF2 can predict alternative protein conformations when sufficient coevolutionary information is provided. A recent computational approach called Alternative Contact Enhancement (ACE) identified coevolutionary information unique to both folds of 56 fold-switching proteins, confirming that MSAs often contain structural information unique to both conformations19. Nevertheless, after combining all correctly predicted fold switch pairs from 282,000 predicted structures (Fig. 1b), AlphaFold2 misses this information in 35/56 cases. Thus, current enhanced sampling approaches typically do not enable AF2 to consistently detect the dual-fold coevolutionary information present in many MSAs of fold-switching proteins.

AF2 confidence metrics select against alternative conformations of fold switchers

Though AF2 often produces structural models with remarkably high accuracy1, its accuracy is reduced for fold-switching proteins when shallow MSA subsampling is used. We quantified the frequency of inaccurate predictions relative to correct predictions of Fold1 and Fold2 generated by all methods (Fig. 1b). In all cases, 30–49% of predictions did not correspond well to either experimentally determined structure.

To see if AF2 could distinguish between good and inaccurate predictions, the relationship between prediction quality and AF2’s confidence metrics was assessed.
AF2 estimates prediction quality with two confidence metrics: the per-residue predicted Local Distance Difference Test (plDDT) and predicted template modeling (pTM) scores. We sought to determine whether either or both metrics discriminate between the good and poor fold-switch predictions generated by AlphaFold2 and AF-cluster. AF-cluster was selected because it predicted substantially more fold switchers than SPEACH_AF (18 rather than 7), generated fewer inaccurate predictions overall (~30% rather than 43%), and enabled a larger set of diverse predictions to be made.

Neither of AF2’s confidence metrics successfully discriminated between good and inaccurate fold-switch predictions (Fig. 2a, Supplementary Figs. 3–5). Rather, both plDDT and pTM scores assigned lower confidences to diverse correctly predicted conformers and higher confidences to predictions that have not been observed experimentally. Thirty percent of all AF-cluster structures did not match experimentally determined structures of Fold1 or Fold2, making it the most accurate of all AF-based methods (Fig. 1b). However, of its highest ranked structures, the proportion of predictions inconsistent with experiment increased to nearly 70% (Fig. 2a, Supplementary Data 4, 5). A similar trend was observed for AF2.3.1 runs with standard settings (Supplementary Figs. 4–5). Interestingly, upon dividing targets into “Easy” and “Complex” based on the type and amount of conformational change, “Complex” targets were better represented in the “Top10” and “All” categories than “Easy” at all quality levels (Supplementary Data 6).

Fig. 2: AF2 confidence metrics select against alternative conformations and do not predict the most energetically favorable fold-switch conformations. a Bar-plot representation of prediction success in Top1, Top10 and All fold-switch predictions indicates that more experimentally unobserved conformations are selected as prediction confidence increases.
These trends are apparent in trendline plots showing the change in fraction of predictions as a function of prediction confidence. The leftmost 3 trendlines are from All predictions; the middle and rightmost are from the Top10 and Top1 most confident predictions for each of 92 fold switchers. For each column of trendlines, the leftmost dot represents all conformations (not weighted by confidence), the next is predictions with medium confidence, then good confidence, and finally high confidence. Confidences are determined by ≥70% (medium), 80% (good), 90% (high) of residues with Cα plDDT scores ≥70. b AF2’s structure module predicts the lower energy conformations of fold switchers with better accuracy and higher confidence than higher energy conformations 50% of the time, equal to random chance. Blue dots represent correctly predicted ground state conformers with higher confidence; red dots represent correctly predicted excited state conformers with higher confidence than the low energy conformers; and gray dots represent proteins observed to sample both folds at roughly equal proportions at equilibrium. Axes represent TM-scores of both conformations relative to experiment. Source data are provided as a Source Data file.

These results strongly indicate that AF2’s confidence metrics select against experimentally consistent predictions of fold switchers, especially Fold2, in favor of experimentally inconsistent predictions. For instance, while AF-cluster correctly predicted 18/92 Fold2 conformations overall, only 7/92 were identified amongst high quality predictions (p < 8.1 × 10⁻⁴, one-sided binomial test). Further, significantly fewer correctly predicted conformations (either Fold1 or Fold2) were identified amongst high-quality models (37) than amongst all (53, p < 6.6 × 10⁻⁴, one-sided binomial test).

Some of the experimentally unobserved conformations predicted by AF2 have been proposed to correspond to folding intermediates6.
To the best of our knowledge, there is no experimental evidence supporting this claim for fold-switching proteins. In fact, a recently characterized folding intermediate of the transcriptional regulator RfaH suggests the opposite22. AF2-multimer predicted a hybrid α-helical/β-sheet fold with high confidence for its fold-switching C-terminal domain (Supplementary Fig. 6). This prediction is not consistent with experiment: most notably, the N-terminal portion of the AF2 prediction folds into a β-hairpin, while the experimentally observed intermediate has helical propensities in that region22. Thus, high confidence AF2 predictions that differ from experimentally determined structures do not necessarily correspond to folding intermediates, consistent with previous observations31.

To further address whether AF2-based enhanced sampling methods can predict folding intermediates, we compared models of the essential mannosyltransferase PimA from M. tuberculosis with structures from accelerated molecular dynamics (MD) simulations consistent with 19F NMR experiments. These experiments revealed four functionally relevant states of PimA that coexist in dynamic equilibria21: two stable fold-switched states and two intermediates. Since only AF-cluster successfully predicted both stable fold-switched states of PimA, we searched among its ~1400 models for structures resembling the two intermediates (Methods: PimA Intermediates). None were found. Among the models, 47% resembled Fold1 (active-compact state of PimA), 0.2% resembled Fold2 (inactive-compact or apo state), and the remaining ~53% of predictions did not resemble any of the four states, though many of these predictions (50%/53%) had low confidence (average plDDT <70).

AF2’s inability to discriminate between good and poor predictions of fold switchers suggests that its confidence metrics may have broader limitations.
To further assess this possibility, we used AF2’s structure module to energetically rank fold-switching protein pairs (Methods: AF2Rank). This approach correctly selected experimentally consistent structures among diverse models of 283 proteins8. Here, it correctly selected the ground state conformations of fold-switching proteins 50% of the time (Fig. 2b). In other words, the selective power of AF2’s structure module amounted to random guessing for fold-switching proteins. It may seem reasonable to hypothesize that this selective failure arises in cases where the ground states of fold switchers are oligomeric and the excited states are monomeric. This may not be the case, however, because AF2 predicts the folds of ground state oligomeric structures, such as KaiB, with the monomer model15. Furthermore, including oligomeric states and binding partners in the multimer model did not produce any unique fold-switch predictions (Supplementary Data 2); instead, all alternative conformations were predicted from monomeric sequences without the need for additional information about oligomeric state or binding partner. Thus, AF2 does not seem to require additional information about oligomeric state or protein binding partner to predict conformations of proteins in oligomeric assemblies or complexes. Providing additional biomolecular information to AF3 did not appreciably increase its predictive success either: out of its 7 successes, only 2 were not predicted by AF2.

AF rarely predicts fold switchers outside of its training set

AF’s modest success in sampling the conformations of fold switchers likely within its training set raises the question of how well it can predict fold switching of sequences outside of it. After all, AF is most valuable when used to infer structural properties of uncharacterized proteins, such as conditionally folding regions of IDPs3 and yet-to-be-discovered folds4.
Thus, we identified seven fold switchers with sequences outside of AF’s training sets and divided them into two categories: distant homologs of a known fold switcher and recently discovered fold switchers. The alternative conformations of all seven fold switchers were either (1) determined after AF2.3.1’s and AF3.0’s last training or (2) inferred by other experimental methods without depositing the alternative structure in the PDB.

First, we assessed AF’s ability to predict fold switching of five distant homologs of the known fold-switching protein Escherichia coli RfaH32, a bacterial transcription factor whose C-terminal domain reversibly switches from an all α-helical ground state to an all β-sheet excited state upon binding RNA polymerase and a specific DNA sequence called ops33. Both conformations of E. coli RfaH have been determined experimentally34,35. Previous work provided circular dichroism (CD) and nuclear magnetic resonance (NMR) evidence for switching in all five of these sequence-diverse RfaH homologs32, all with sequences <35% identical to one another and to E. coli RfaH. As a control, AF’s ability to predict single folding was assessed in five additional experimentally characterized single-folding RfaH homologs whose CTDs were found to assume the β-sheet fold only (Supplementary Table 1).

Although AF2, AF3, and AF-cluster correctly predict that E. coli RfaH, likely in their training sets, switches folds, none of them reliably predicted fold switching in the experimentally confirmed variants not deposited in the PDB. Specifically, AF2.3.1 and AF3.0 predicted a helical CTD in 1/5 cases with moderate confidence (Supplementary Fig. 7). In the other four cases, they predicted the β-sheet conformation only, as they did correctly for all single-folding controls.
To extensively search for fold switching with AF-cluster, we generated 50 models per input MSA with 10 seeds for a total of 140,050 predictions of 10 proteins (Supplementary Data 7) both with and without dropout (>280,000 structures total), plus 5 models per input MSA with 2 seeds using both ColabFold1.3 and 1.5. Combining all predictions, AF-cluster predicted both folds for 4/5 fold-switching homologs and only well-folded β-sheet conformers in the remaining case (Supplementary Fig. 8). However, all helical conformations were predicted with low confidence (average plDDT ≤50), indicating that AF2.3.1 can generate more confident helical CTD predictions than AF-cluster. This finding is consistent with the original AF2 paper’s observation that MSAs with ≥32 sequences are needed for reliable predictions1; AF-cluster-generated MSAs often have ≤10 sequences. Importantly, AF-cluster predicted low-confidence helical conformations in two single-folding RfaH homologs with CTDs experimentally confirmed to assume β-sheet folds rather than α-helices (Supplementary Fig. 8). NMR evidence from a previous study strongly suggests that the Candidatus Kryptonium thompsoni variant assumes the β-sheet conformation only32. Furthermore, the CD spectrum of the T. diversioriginum variant also suggests that it assumes a ground state β-sheet structure consistent with previously characterized RfaH variants whose CTDs do not assume helical conformations (Supplementary Fig. 9). Together, these results demonstrate that neither AF2, nor AF3, nor AF-cluster reliably predicts fold switching of distant RfaH homologs, and AF-cluster predictions do not reliably distinguish between fold-switching and single-folding RfaH variants.

Structures of the two remaining prediction targets were deposited into the PDB in 2023, after AF2.3.1 and AF3 were trained.
Fold switching of Sa1, a 95 amino acid protein that reversibly interconverts between a 3-α-helix bundle and an α/β plait fold in response to temperature, was demonstrated by NMR spectroscopy36. We also included the structure of BCCIPα, a human protein whose sequence is 80% identical to its homolog BCCIPβ. Although BCCIPα has not been shown to switch folds, it assumes a structure completely different from BCCIPβ and has a different binding partner than its homolog37. Previous work has shown that when run with default parameters, AlphaFold2 fails to predict the unique structure of BCCIPα, whose most similar PDB analog differs by 9.9 Å (ref. 37). Thus, we included BCCIPα because (1) we wanted to see if AF-cluster or AF3 could produce its unique structure and (2) although BCCIPα might not switch folds, it tests AF2’s limits in predicting protein folds outside of its training set.

AF2.3.1, AF3, and AF-cluster missed fold switching completely for both Sa1 and BCCIPα (Fig. 3). Specifically, 98.8% (2525/2555) of the Sa1 predictions assumed the α/β plait fold, and 54% (2022/3755) of the BCCIPα predictions assumed the structure of its PDB homolog, BCCIPβ. By contrast, AF2.3.1, AF3, and AF-cluster failed to predict both the 3-α-helix bundle conformation of Sa1 and the experimentally determined conformation of BCCIPα. BCCIPα’s structure was solved in complex with another protein37. Nevertheless, running AF2.3.1’s Multimer model and AF3 with BCCIPα’s binding partner still yielded the BCCIPβ structure (Supplementary Fig. 10). Because its apo structure has not yet been determined, it is possible that apo BCCIPα assumes the same structure as BCCIPβ, in which case AF2.3.1, AF3, and AF-cluster fail to predict its alternative conformation. It is also possible that BCCIPα assumes the same structure in its apo and bound forms, in which case AF2.3.1, AF3, and AF-cluster fail to predict its structure altogether38.
These results cast doubt on AF’s reliability and consistency in predicting the alternative conformations of fold switchers outside of its training set.

Fig. 3: AF2 fails to predict fold switching of two protein structures outside of its training set. a Sa1 is a designed protein that switches reversibly between α/β-plait (PDBID: 8e6y, Fold1) and 3α helix (PDBID: 2fs1, Fold2) folds triggered by temperature changes. Cartoon representations of Fold1 are colored blue for N-terminal residues (1 to 10), orange for the fold-switching residues (11 to 66, aligning with the amino acid sequence in Fold2, also in orange), and red for C-terminal residues (67 to 95). Heatmaps of 50 predictions (M0 to M49) for each of 51 sequence clusters showing the similarity (TM-scores) to Fold1 (left panel) and Fold2 (right) are presented below the cartoon representations of the two states. AF-cluster consistently predicts Fold1 but misses Fold2. b BCCIPβ and BCCIPα are human protein isoforms with 80% sequence identity that adopt distinct folds (13 Å RMSD). AF-cluster was run on BCCIPα’s sequence. In the right panel, a cartoon representation of BCCIPα (colored blue to red from N-terminus to C-terminus) is shown with the heatmap of TM-scores of 50 predictions (model numbers M0 to M49) for each of 75 sequence clusters compared to the fold adopted by the α isoform (PDBID: 8exf, chain B). In the left panel, the BCCIPβ experimental structure (PDBID: 7kys) is shown with the heatmap of TM-scores compared to the fold adopted by the β isoform. AF-cluster frequently predicts the structure of the β-isoform but misses the experimentally consistent α-isoform structure.

AF2 predictions are not always consistent with coevolutionary restraints and are better explained by memorization of training set structures

Why does AF2 fail to predict alternative conformations outside of its training set? Two dominant explanations have been proposed. The first is insufficient information.
AF2 has been proposed to use MSA-derived restraints as a starting point to minimize the energies of structures, much like NMR structure determination8. If AF2 works this way, its failure to predict a given conformation would result from improper restraints, i.e. the input MSA did not supply the information needed to specify the fold of interest. The second is structure “memorization”. In this case, AF2 does not always rely on coevolutionary restraints because it has “seen” certain folds during training and stored relevant structural information in its weights13, allowing it to associate learned structures with related sequences. The distinction between these two explanations is important. If AF2 predicts structures by energy minimizing structural restraints from MSAs, it can, in principle, predict any yet-to-be-discovered fold from its sequence given proper MSA input. By contrast, if AF2 relies on its training set to predict certain structures, it may be unable to correctly associate some sequences with their corresponding structures. This may explain its failure to predict the correct structure of BCCIPα and most RfaH variants, for instance. It also suggests that both structures of Sa1 may be predicted if AF2 can be steered to associate its sequence with the homologous 3-α-helical bundle conformation in its training set.

We applied our knowledge of AF2’s architecture to assess how it predicts alternative conformations (Supplementary Fig. 11). AF2 combines two modules to predict protein structure. The first is the Evoformer, which extracts evolutionary couplings from input MSAs and stores them as a pair representation, a tensor of real numbers used to predict distances between each amino acid pair in a protein chain. The pair representation and the target sequence are then passed to the Structure module, which maps these inputs to a three-dimensional structure.
This predicted structure, along with the pair representation, can be passed back into the AF2 network for further rounds of refinement, a process called recycling. Thus, before recycling, the pair representation is informed by the input MSA only. After recycling, the pair representation is updated with information both from the MSA and the protein model generated by the Structure module (Supplementary Fig. 11). Consequently, coevolutionary information that AF2 derives from an input MSA can be assessed most reliably at 0 recycles, since the MSA does not exclusively supply the information used to inform the pairwise representation after recycling.

Leveraging this knowledge, we observed that ColabFold39 (CF), an efficient-yet-accurate implementation of AF2, predicts structures of E. coli RfaH inconsistent with the restraints it infers from MSAs at each recycling step (Fig. 4). Specifically, by leveraging coevolutionary information from its input MSA at 0 recycles, CF predicts the active conformation of RfaH with a fully β-sheet C-terminal domain (CTD). Interestingly, at subsequent recycling steps, its CTD becomes increasingly helical, resembling the autoinhibited state. Since CF updates the input MSA at the beginning of each recycling step, this structural change could arise from updated MSA-based coevolutionary information updating the pair representation. This was not the case, however, when we inputted each updated MSA into CF with 0 recycles. Instead, CF predicted structures with fully β-sheet CTDs from all MSAs (Fig. 4). Thus, AF2’s MSA-derived pairwise restraints are inconsistent with those from its recycled structure predictions (Supplementary Fig. 12), indicating that the autoinhibited prediction of RfaH likely arises from something other than evolutionary restraints inferred from its input MSA.

Fig. 4: AF2 structure predictions can be inconsistent with structural restraints from Evoformer. Although the full AF2 model predicts the autoinhibited form of RfaH (green helical structure, left panel) after 2 recycles (R2), the evolutionary restraints from Evoformer correspond to its active β-sheet form (blue β-sheet structures, right panel and Supplementary Fig. 12) from each MSA inputted into the full AF2 model (left panel). The initial input MSA is depicted in the top lefthand corner with target sequence bold and colored black, blue, and yellow. Randomly subsampled MSAs inputted at each recycle are depicted in both panels, with identical MSAs being inputted at R0,1,2 and MSA_R0.0, MSA_R1.0, MSA_R2.0, respectively. The right and left panels differ by how AF2 makes predictions. In the right panel, restraints from input MSAs should inform the predictions because the input MSA is passed through AF2 (Evoformer and Structure Module) only once (0 recycles); this also applies to the R0 (0 recycles) step in the left panel. All predictions based on these MSA restraints yield structures with β-sheet CTDs (blue). The recycling steps in the left panel (R1 and R2) differ because they update the prediction with both previous MSA restraints and the previously predicted structures from the Structure Module. In these cases, the CTD becomes increasingly helical (green regions), indicating that the prediction changes during the recycling process. Right and left panels are shaded to represent what information drives predictions: beige (recycling process, left) and light blue (Evoformer, right).

Since the coevolutionary patterns that CF recognizes are inconsistent with its recycled prediction of autoinhibited RfaH, we sought to identify what drives this prediction. Previous work has suggested that predictions may sometimes be informed by structures “memorized” during training13.
This seemed like a reasonable explanation for the inconsistencies we observed between evolutionary couplings and predicted structures (Supplementary Fig. 12).

To test the possibility that the autoinhibited form of RfaH’s CTD may have been memorized during training, we inputted the single sequence of RfaH’s CTD into CF and examined its predictions after 0 recycles. This assessment focuses on what may have been “memorized” during PDB training since (1) the Evoformer cannot determine amino acid covariances from a single sequence and (2) 0 recycles affords an initial structural guess only from the target sequence, whereas recycling would allow deeper exploration of the AF2 network and may not suggest memorization. Out of the 25 RfaH CTD models generated, CF predicted that it forms a helical bundle 100% of the time (Supplementary Fig. 13a). These predictions contradict experimental observation: expressed in isolation, the RfaH CTD folds into a β-sheet structure, not a helical bundle40. This result again demonstrates AF2’s limited learning of protein energy landscapes. It also indicates that AF2 has likely memorized RfaH’s helical bundle conformation during training since predictions consistently resemble the helical structure likely in AF2’s training set. To probe for other cases of putative structural memorization, we performed single-sequence predictions on other fold-switching sequences and identified resulting models consistently resembling their corresponding PDB structures. These include the monomeric conformation of an archaeal Selecase and the β-sheet fold of NusG, a single-folding RfaH homolog (Supplementary Fig. 13b). Further, all AF-cluster KaiB predictions could be reproduced successfully using this approach (Supplementary Fig. 13c).

AF3 misassigns evolutionary restraints of dimeric XCL1 while AF2 predicts it correctly by structure memorization

Up to now, the AF2-based methods developed to predict alternative protein conformations claim that their predictions are informed by coevolutionary information. By our benchmarking and a previous report15, none of them successfully predict the dimeric conformation of human XCL1, an immune system protein. Neither does AF3, which predicts an experimentally unobserved domain-swapped structure >16 Å from its experimentally observed counterpart (Fig. 5a) with average plDDTs of the ordered regions of the two top-scoring models >70. Interestingly, the coevolutionary patterns of the predicted dimer were nearly identical to those of XCL1’s monomeric conformation (Fig. 5b), which all tested implementations of AF2 and AF3 capture successfully. Although the predicted dimer structures and the experimentally determined monomeric conformation differ by >14.9 Å, this result suggests that AF3 predicted the experimentally unobserved dimer by mapping some of the evolutionary couplings corresponding to intrachain (monomer) interactions to interchain (dimer) interactions instead. Unfortunately, the evolutionary couplings corresponding to monomeric lymphotactin cannot inform the prediction of dimeric lymphotactin, whose intra- and interchain interactions differ (Supplementary Fig. 14).

Fig. 5: AF predicts the dimeric form of XCL1 through structure memorization rather than coevolutionary inference. a Though AF3 was given appropriate stoichiometry and environmental conditions to predict the lymphotactin dimer, its prediction did not match experiment. b The coevolutionary patterns of AF3’s predicted XCL1 dimer match those of its monomeric conformation almost exactly. Contact maps generated from the AF3 prediction and experimentally determined monomeric conformation (2hdm).
Upper diagonal corresponds to contacts unique to the AF3 prediction (light gray; smaller dots correspond to intermolecular contacts, larger to intramolecular); lower diagonal corresponds to contacts unique to the experimentally determined monomeric conformation (black); common contacts are medium gray; coevolutionary information inferred from the MSA using ACE is teal. c Both AF2 multimer and AF2 predict the correct XCL1 dimer structure from single sequences and 0 recycles, suggesting that they memorized its structure during training.

Though none of the AF-based methods successfully inferred dimeric XCL1’s structure from evolutionary couplings, both AF2 and AF2-multimer successfully predict its structure from sequence alone (Fig. 5c). As before, both predictions were run with 0 recycles to assess whether the models associate the sequence with a structure “memorized” during training; both results suggest that AF2 predicts the dimeric form of XCL1 by sequence association rather than energy minimizing coevolutionary restraints. These results again suggest that some successful AF2 predictions are informed by structures learned during training, such as autoinhibited RfaH and the dimeric form of XCL1. Thus, we suspect that if these structures had not been in the training set, AF2 would not predict them.
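
The contact-map comparison summarized in Fig. 5b (contacts unique to the prediction, unique to the experimental structure, and common to both) can be illustrated with a minimal sketch. This is not the pipeline used in this work: the function names, the 8 Å distance cutoff, the minimum sequence separation of 3 residues, and the use of one representative atom per residue are all assumptions made for illustration, and the sketch handles a single chain only, without separating intra- from intermolecular contacts.

```python
from math import dist


def contact_set(coords, cutoff=8.0, min_separation=3):
    """Return residue-index pairs (i, j), i < j, closer than `cutoff` angstroms.

    `coords` holds one (x, y, z) tuple per residue (e.g. a C-alpha atom).
    The cutoff and minimum sequence separation are illustrative assumptions.
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if dist(coords[i], coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def compare_contacts(pred_coords, exp_coords, cutoff=8.0, min_separation=3):
    """Partition contacts into prediction-only, experiment-only, and common.

    Both coordinate lists must use the same residue indexing.
    """
    if len(pred_coords) != len(exp_coords):
        raise ValueError("structures must cover the same residues")
    pred = contact_set(pred_coords, cutoff, min_separation)
    exp = contact_set(exp_coords, cutoff, min_separation)
    return {
        "unique_to_prediction": pred - exp,
        "unique_to_experiment": exp - pred,
        "common": pred & exp,
    }


if __name__ == "__main__":
    # Toy five-residue coordinates, invented purely to exercise the functions.
    predicted = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (4, 6, 0), (0, 4, 0)]
    experimental = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (8, 4, 0), (4, 4, 0)]
    for label, pairs in compare_contacts(predicted, experimental).items():
        print(label, sorted(pairs))
```

In a sketch like this, a prediction whose contacts largely coincide with those of a different experimentally determined conformation would show up as a large "common" set, which is the pattern described for the AF3 dimer prediction and the XCL1 monomer.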
