DynaMight: estimating molecular motions with improved reconstruction from cryo-EM images

Description of conformational variability

We describe the ith of Nd particle images, yi, with the following forward model:

$$y_{i}={\mathcal{C}}_{i}\ast {\mathbf{P}}_{\phi_{i}}\,f\big(\Gamma_{z_{i}}({\mathbf{x}})\big),$$
(1)

where \({\mathcal{C}}_{i}\ast\) denotes convolution with the contrast transfer function (CTF), \({\mathbf{P}}_{\phi_{i}}\) the projection of a particle that is rotated and shifted by its pose ϕi ∈ SE(3).
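As an illustration of how the CTF convolution in equation (1) acts on a projection image, the sketch below applies a simplified, astigmatism-free CTF as a point-wise multiplication in Fourier space. The function names, default parameter values and the exact aberration convention are assumptions made for this example; this is not DynaMight's CTF code.

```python
import torch

def simple_ctf(size, pixel_size, defocus, cs=2.7e7, wavelength=0.0197, w=0.1):
    """Illustrative, astigmatism-free CTF on a size x size Fourier grid.
    Lengths (pixel_size, defocus, cs, wavelength) are in Angstrom; w is the
    amplitude contrast. A simplified model only, not DynaMight's implementation."""
    freqs = torch.fft.fftfreq(size, d=pixel_size)            # spatial frequencies (1/A)
    ky, kx = torch.meshgrid(freqs, freqs, indexing="ij")
    k2 = kx**2 + ky**2
    # phase aberration chi(k) = pi*lambda*dz*k^2 - (pi/2)*Cs*lambda^3*k^4 (one common convention)
    chi = torch.pi * wavelength * defocus * k2 - 0.5 * torch.pi * cs * wavelength**3 * k2**2
    return -((1 - w**2) ** 0.5 * torch.sin(chi) + w * torch.cos(chi))

def apply_ctf(projection, ctf):
    """Convolution with the CTF, implemented as multiplication in Fourier space."""
    return torch.fft.ifft2(torch.fft.fft2(projection) * ctf).real

# usage: image = apply_ctf(projection, simple_ctf(size=128, pixel_size=1.5, defocus=10000.0))
```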
We choose to represent the function f by a sum of Ng 3D Gaussian basis functions, or pseudo-atoms:

$$f({\mathbf{x}})\approx \hat{f}({\mathbf{x}}):=\sum\limits_{j=1}^{N_{\mathrm{g}}}a_{j}\,{\mathcal{G}}_{s_{j}}({\mathbf{x}}-c_{j}),$$
(2)

where \({\mathcal{G}}_{s_{j}}:{\mathbb{R}}^{3}\to {\mathbb{R}}\) is \({\mathcal{G}}_{s_{j}}({\mathbf{x}})=\exp\left(-\parallel {\mathbf{x}}\parallel^{2}/s_{j}\right)\). Here, aj > 0 denote the amplitudes, sj > 0 the widths and cj the central positions of the Gaussian functions.
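As a concrete illustration of equation (2), the following sketch evaluates the pseudo-atom density \(\hat{f}\) at arbitrary 3D points; tensor names, shapes and the example values are assumptions made for this illustration.

```python
import torch

def gaussian_density(query, centers, amplitudes, widths):
    """Evaluate f_hat(x) = sum_j a_j exp(-||x - c_j||^2 / s_j), as in equation (2).

    query:      (M, 3)   points at which to evaluate the density
    centers:    (N_g, 3) Gaussian centers c_j
    amplitudes: (N_g,)   amplitudes a_j > 0
    widths:     (N_g,)   widths s_j > 0
    returns:    (M,)     density values
    """
    d2 = torch.cdist(query, centers) ** 2                    # (M, N_g) squared distances
    return (amplitudes * torch.exp(-d2 / widths)).sum(dim=-1)

# Example: density of 500 random pseudo-atoms on a small 32^3 grid (arbitrary units)
centers = torch.rand(500, 3) * 100.0
amplitudes = torch.ones(500)
widths = torch.full((500,), 4.0)
ax = torch.linspace(0.0, 100.0, 32)
grid = torch.stack(torch.meshgrid(ax, ax, ax, indexing="ij"), dim=-1).reshape(-1, 3)
volume = gaussian_density(grid, centers, amplitudes, widths).reshape(32, 32, 32)
```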
We assume that all particle images are conformational variations of a single consensus structure that is described by the Ng 3D Gaussian basis functions, and that zi in equation (1) is the conformational encoding for the ith image. We describe the deformation of individual particles as a deviation from the consensus coordinates x: Γ(x) = x − δ(x), so that:

$$\begin{array}{ll}\hat{f}(\Gamma({\mathbf{x}}))&=\sum\limits_{j=1}^{N_{\mathrm{g}}}a_{j}\,{\mathcal{G}}_{s_{j}}\big(\Gamma({\mathbf{x}})-c_{j}\big)\\ &=\sum\limits_{j=1}^{N_{\mathrm{g}}}a_{j}\,{\mathcal{G}}_{s_{j}}\big({\mathbf{x}}-\delta({\mathbf{x}})-c_{j}\big)\\ &\approx \sum\limits_{j=1}^{N_{\mathrm{g}}}a_{j}\,{\mathcal{G}}_{s_{j}}\big({\mathbf{x}}-\left(c_{j}+\delta(c_{j})\right)\big),\end{array}$$
(3)

where the last approximation assumes that the deformation field is locally constant and that the density surrounding cj moves in a similar manner. This enables us to describe the deformations as displacements of the Gaussian centers, which is a computationally tractable representation. Furthermore, the widths sj and amplitudes aj of all Gaussian pseudo-atoms are kept the same for the entire dataset. This means that DynaMight is by design constrained to only model mass-conserving heterogeneity and cannot handle nonstoichiometric mixtures. Therefore, compositional heterogeneity should be removed from the dataset by alternative approaches before running DynaMight.
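Equations (1) and (3) together suggest a simple way to render a particle image: rotate the displaced centers cj + δ(cj) into the particle's pose and sum the projected Gaussians, since the projection of an isotropic 3D Gaussian along the beam axis is a 2D Gaussian of the same width. The sketch below is a simplified orthographic renderer (no CTF, no in-plane shifts); names, shapes and the rotation convention are assumptions, not DynaMight's rendering code.

```python
import torch

def render_projection(centers, amplitudes, widths, rotation, grid_size, pixel_size):
    """Render a 2D projection of displaced Gaussian pseudo-atoms (cf. equations (1), (3)).

    centers:  (N_g, 3) deformed centers c_j + delta(c_j), box-centred, in Angstrom
    rotation: (3, 3)   rotation matrix of the particle pose (convention assumed)
    returns:  (grid_size, grid_size) projection image; CTF and shifts are omitted
    """
    rotated = centers @ rotation.T                           # rotate into the particle frame
    xy = rotated[:, :2]                                      # orthographic projection along z
    ax = (torch.arange(grid_size) - grid_size // 2) * pixel_size
    gy, gx = torch.meshgrid(ax, ax, indexing="ij")           # image-plane coordinates
    d2 = (gx[..., None] - xy[:, 0]) ** 2 + (gy[..., None] - xy[:, 1]) ** 2   # (H, W, N_g)
    # the integral over z only contributes a width-dependent constant, absorbed into a_j
    return (amplitudes * torch.exp(-d2 / widths)).sum(dim=-1)
```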
Estimation of conformational variability

For learning the deformations, we use a VAE that consists of two neural networks, namely an encoder \({\mathcal{E}}\) that predicts an Nl-dimensional latent representation zi per particle image, and a decoder \({\mathcal{D}}\) that predicts the displacement of all Gaussian pseudo-atoms in the model. The encoder is a fully connected neural network with three linear layers and rectified linear unit activation functions. The input is a (real-space) experimental image yi and the outputs are two vectors \((\mu_{i},\sigma_{i})\in {\mathbb{R}}^{N_{l}}\times {\mathbb{R}}^{N_{l}}\), which describe the mean and standard deviation used to generate a sample zi that serves as input for the decoder. The decoder \({\mathcal{D}}(z_{i},c_{j})\) then approximates the term cj + δ(cj) for each zi. We define the decoder for the entire set of Ng positions as:

$${\mathcal{D}}(z_{i},{\mathbf{c}}^{\mathbf{0}})={\mathbf{c}}^{\mathbf{0}}+\delta_{\theta}(z_{i},{\mathbf{c}}^{\mathbf{0}}).$$
(4)

In the above, c0 denotes the set of all consensus positions and δθ is a differentiable function, \(\delta_{\theta}(z_{i},{\mathbf{c}}^{\mathbf{0}})=[\delta_{\theta}(z_{i},c_{1}),\ldots ,\delta_{\theta}(z_{i},c_{N_{\mathrm{g}}})]\), with parameters θ, that approximates δ for each position (Extended Data Fig. 1). In practice, we evaluate the decoder for each position cj and query δθ with a positional encoding of cj, concatenated with the latent representation zi that describes the conformation of each particle. The output positions are used to generate a projection image pi of the deformed model in the pose of the particle, and the difference with the experimental image \(\parallel p_{i}-y_{i}\parallel_{\Sigma}^{2}\) is minimized during training of the neural networks. Once trained, for a latent embedding of the whole dataset, one obtains a family of deformation fields \({\mathcal{D}}(z_{i},{\mathbf{x}})\approx \Gamma_{z_{i}}({\mathbf{x}})\) that is defined over the entire 3D space.
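The following sketch mirrors the architecture described above: a three-layer, fully connected encoder that maps a flattened image to (μi, σi), and a decoder that concatenates a sinusoidal positional encoding of each consensus position with the latent code zi and adds the predicted displacements to c0 (equation (4)). Hidden-layer widths, the number of encoding frequencies and other details are assumptions, not DynaMight's exact architecture.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Fully connected encoder: flattened image -> (mu, log sigma) -> sampled latent z."""
    def __init__(self, n_pixels, latent_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_pixels, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),
        )

    def forward(self, images):                                # images: (B, n_pixels)
        mu, log_sigma = self.net(images).chunk(2, dim=-1)
        z = mu + torch.exp(log_sigma) * torch.randn_like(mu)  # reparameterization trick
        return z, mu, log_sigma


def positional_encoding(pos, n_freq=4):
    """Sinusoidal encoding of 3D positions: (N_g, 3) -> (N_g, 6 * n_freq)."""
    freqs = (2.0 ** torch.arange(n_freq, dtype=torch.float32)) * torch.pi
    args = pos[..., None] * freqs                             # (N_g, 3, n_freq)
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1).flatten(-2)


class DisplacementDecoder(nn.Module):
    """Decoder D(z_i, c0) = c0 + delta_theta(z_i, c0), as in equation (4)."""
    def __init__(self, latent_dim, n_freq=4, hidden=128):
        super().__init__()
        self.n_freq = n_freq
        self.net = nn.Sequential(
            nn.Linear(6 * n_freq + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, z, consensus):                          # z: (latent_dim,), consensus: (N_g, 3)
        enc = positional_encoding(consensus, self.n_freq)
        z_rep = z.expand(consensus.shape[0], -1)
        delta = self.net(torch.cat([enc, z_rep], dim=-1))
        return consensus + delta                              # displaced positions c_j + delta(c_j)
```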
Regularization and model bias

Because of high levels of experimental noise, cryo-EM reconstruction is an ill-posed problem. Even for standard, structurally homogeneous refinement, there are many possible rotational and translational assignments for each image. When estimating conformational variability, the poses are known, but many deformed density maps may explain each experimental image equally well. Therefore, in both cases regularization is essential for robust reconstruction.

The most common form of regularization in VAEs is to constrain the distribution of latent variables to follow a Gaussian distribution, which leads to the model learning more meaningful and structured representations. The design of the decoder in Fig. 1 allows an additional form of regularization that imposes prior knowledge on its output, the real-space deformation fields. A wide range of physically and biologically inspired penalties can be incorporated as priors on the deformations; see also refs. 12,14,15. A potentially powerful source of prior information would be an atomic model of the consensus structure, which could provide constraints on chemical bonds, maintain secondary structure elements and so on.

To explore direct regularization of the deformation fields, we tested two approaches. The first approach uses prior information from an atomic model that is built into the consensus map before running DynaMight. It generates a coarse-grained Gaussian representation of the atom positions, and then minimizes changes in the distances between these Gaussians according to the bonds that exist in the atomic model:

$${\mathcal{R}}(E)=\sum\limits_{\{(i,\,j):E_{ij}=1\}}{\big|d(c_{i},c_{j})-d({\mathcal{D}}(c_{i},z),{\mathcal{D}}(c_{j},z))\big|}^{2},$$
(5)

where Eij = 1 if there is a bond between the two pseudo-atoms ci and cj, and d denotes Euclidean distance. Deformations estimated with this regularization scheme result in Gaussians that remain close to a coarse-grained representation of the original atomic model.

The second regularization approach uses less prior information and does not require an atomic model. Instead, Gaussians are placed randomly to fill densities in the consensus map, and connections E in equation (5) are made for all pairs of Gaussians that are within a distance of 1.5 times the average distance between all Gaussians and their two nearest neighbors. This regularization enforces overall smoothness in the deformations. Additional penalties that prevent Gaussians from coming too close to each other, or from moving too far away from other Gaussians, are also applied to ensure a physically plausible distribution of Gaussians.
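A minimal sketch of the regularizer in equation (5) follows: squared changes in pairwise distances between connected pseudo-atoms, summed over the edge set E. The edge-list format is an assumption; the same function applies whether E comes from atomic-model bonds or from the nearest-neighbor connections described above.

```python
import torch

def distance_regularizer(consensus, deformed, edges):
    """R(E) = sum over connected pairs of |d(c_i, c_j) - d(D(c_i, z), D(c_j, z))|^2 (equation (5)).

    consensus: (N_g, 3) consensus positions c_j
    deformed:  (N_g, 3) decoder outputs D(z, c_j) for one particle
    edges:     (N_e, 2) integer index pairs (i, j) for which E_ij = 1
    """
    i, j = edges[:, 0], edges[:, 1]
    d0 = (consensus[i] - consensus[j]).norm(dim=-1)          # distances in the consensus model
    d1 = (deformed[i] - deformed[j]).norm(dim=-1)            # distances after deformation
    return ((d0 - d1) ** 2).sum()
```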
Improved 3D reconstruction

We propose an algorithm that uses the estimated deformation fields Γ to obtain an improved reconstruction of the consensus structure that incorporates information from all experimental images. To map individual particle images back to a hypothetical consensus state, one needs to estimate the inverse deformations, which represents a challenge. Whereas the inverse deformation on the displaced Gaussians is given by the negative displacement vector, that is Γ−1(Γ(ci)) = ci, the inverse deformation field needs to be inferred at all Cartesian grid positions of the improved reconstruction. We train a neural network as a regression function to estimate a deformation field that matches the known inverse deformation at the given sampling points Γ(ci), but can be evaluated at arbitrary positions. This network consists of a multilayer perceptron with six layers and a single additive residual connection to the original coordinates of the consensus model c0. Similar to the forward deformation model, the network takes the latent code zi and the deformed positions Γ(ci) as inputs and aims to output the original positions ci. In addition to the inversion of the forward fields on the sampling points, we force the inverse field to be smooth by adding a regularization term to the loss function.
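The inverse deformation can thus be fitted as a straightforward regression, as illustrated below: a six-layer MLP takes (zi, Γ(cj)) and is trained to return cj, with an additive residual connection to the consensus coordinates. Hidden-layer widths are assumptions, and the additional smoothness penalty mentioned above is omitted from this sketch.

```python
import torch
import torch.nn as nn

class InverseDeformation(nn.Module):
    """Regression network that maps deformed positions Gamma(c_j) back to c_j.

    Six-layer MLP with an additive residual connection to the consensus
    coordinates c0, as described in the text; hidden widths are illustrative.
    """
    def __init__(self, latent_dim, hidden=128, n_layers=6):
        super().__init__()
        layers, width_in = [], 3 + latent_dim
        for _ in range(n_layers - 1):
            layers += [nn.Linear(width_in, hidden), nn.ReLU()]
            width_in = hidden
        layers.append(nn.Linear(hidden, 3))
        self.net = nn.Sequential(*layers)

    def forward(self, z, deformed, consensus):                # (latent_dim,), (N_g, 3), (N_g, 3)
        z_rep = z.expand(deformed.shape[0], -1)
        return consensus + self.net(torch.cat([deformed, z_rep], dim=-1))

def inversion_loss(model, z, deformed, consensus):
    """Fit the inverse field on the sampling points: Gamma^{-1}(Gamma(c_j)) should equal c_j."""
    pred = model(z, deformed, consensus)
    return ((pred - consensus) ** 2).sum(dim=-1).mean()
```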
The algorithm aims to improve the reconstruction of the density f, using the known deformations Γ; that is, we aim to find the minimizer \(\hat{f}\) of the data fidelity

$$\hat{f}=\mathop{\mathrm{argmin}}\limits_{f}\sum\limits_{i}\|{\mathcal{C}}_{i}(P_{\Gamma_{i}}\,f\,)-y_{i}\|^{2}.$$
(6)

This minimizer can be computed using the reconstruction formula

$$\hat{f}={\left[\sum\limits_{i}P_{\Gamma_{i}}^{*}\circ {\mathcal{C}}_{i}^{2}\circ P_{\Gamma_{i}}\right]}^{-1}\left[\sum\limits_{i}P_{\Gamma_{i}}^{*}({\mathcal{C}}_{i}^{*}y_{i})\right]={D}^{-1}\left[\sum\limits_{i}P_{\Gamma_{i}}^{*}({\mathcal{C}}_{i}^{*}y_{i})\right],$$
(7)

to get an estimate of the unknown density f. Here, D is a matrix that depends on the estimated deformations, and \(P_{\Gamma_{i}}^{*}\) is the composition of the backprojection operator and the inverse deformation corresponding to the ith particle (Fig. 1). For the structurally homogeneous case, Γ is the identity operator and D is diagonal in Fourier space, so the inverse can be computed simply by division, provided that the distribution of projection directions covers the whole frequency domain and D has no zeros on the diagonal. In the presence of deformations, this matrix is no longer diagonal and would be too expensive to compute or store. We approximate equation (7) by using the filter that would correspond to the homogeneous case, without deformations. Although, even in the optimal scenario of complete data of clean projection images, this method does not yield a minimizer of the functional in equation (6), it still corrects for the deformations to some degree. When the deformation fields are not smooth, for example when two nearby domains move in opposite directions, reconstruction with the proposed algorithm may introduce artifacts at the interface between the domains.
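In the homogeneous approximation used here, D is diagonal in Fourier space, so the correction amounts to dividing the (deformation-corrected) CTF-weighted backprojection by the accumulated CTF² weights. A minimal sketch of that division step, with a small constant to stabilize poorly sampled frequencies, is shown below; variable names and the regularization constant are assumptions.

```python
import torch

def homogeneous_filter(data_ft, weight_ft, eps=1e-3):
    """Approximate D^{-1} in equation (7) with the homogeneous-case Fourier filter.

    data_ft:   3D Fourier transform of sum_i P*_{Gamma_i}(C*_i y_i), the
               CTF-weighted backprojection of all (deformation-corrected) particles
    weight_ft: accumulated CTF^2 weights on the same Fourier grid
    eps:       small constant that regularizes frequencies with little data
    """
    return data_ft / (weight_ft + eps)

# usage sketch: volume = torch.fft.ifftn(homogeneous_filter(data_ft, weight_ft)).real
```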
Implementation details

The initial positions of the Gaussians for the VAE are obtained by approximating a map from a consensus refinement with a Gaussian model. This initial consensus map does not correspond to an actual state of the complex, but rather to a mixture of different conformations. Therefore, parts of the map will have regions of poorly defined density, and correspondingly fewer Gaussians. To overcome this limitation, we update the positions of the consensus Gaussian model throughout the estimation of the deformations, such that the positions cj may correspond to a single conformation at the end of the iterative process. We recommend using two Gaussians per residue, but a smaller number can be chosen if computational resources are limited or only a low-resolution estimate of the motions is required.

After initialization of the Gaussians, in the first epochs of training of the VAE, we only optimize the global Gaussian parameters, that is their widths, amplitudes and positions. These parameters are optimized with the Adam optimizer and a learning rate of 0.0001. After this initial warm-up phase, we start optimization of the network parameters of the VAEs, again using the Adam optimizer with a learning rate of 0.0001. During this second phase, the parameters of the Gaussians continue to be updated. Training of the VAEs is stopped when the updates of the consensus model no longer yield improvements, or when a fixed, user-defined number of epochs has been completed.
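As an illustration of this two-phase schedule (not DynaMight's actual training code), the sketch below warms up the global Gaussian parameters first and then optimizes them jointly with the encoder and decoder, using Adam at a learning rate of 1e-4; the loss function and data loader are placeholders supplied by the caller.

```python
import torch

def train_two_phase(encoder, decoder, gaussian_params, loader, loss_fn,
                    n_warmup=10, n_epochs=200, lr=1e-4):
    """Two-phase schedule: (1) optimize only the Gaussian widths, amplitudes and
    positions; (2) jointly optimize them with the VAE network parameters.

    gaussian_params: list of leaf tensors (centers, widths, amplitudes) with requires_grad=True
    loss_fn(batch, encoder, decoder, gaussian_params, warmup): placeholder for the
        image data term plus the latent and deformation priors
    """
    warmup_opt = torch.optim.Adam(gaussian_params, lr=lr)
    joint_opt = torch.optim.Adam(
        gaussian_params + list(encoder.parameters()) + list(decoder.parameters()), lr=lr)

    for phase, optimizer, n in (("warmup", warmup_opt, n_warmup),
                                ("joint", joint_opt, n_epochs)):
        for _ in range(n):
            for batch in loader:
                optimizer.zero_grad()
                loss = loss_fn(batch, encoder, decoder, gaussian_params,
                               warmup=(phase == "warmup"))
                loss.backward()
                optimizer.step()
```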
Training of the VAE is performed on two half sets, where two encoder–decoder pairs are trained independently, as illustrated in Fig. 1. This procedure yields two independent families of deformation fields, one for each half set. The approximate inverses of these deformations are then used by the deformed weighted backprojection algorithm to generate two independent maps with improved estimates for the consensus structure. These half-maps can then be used in conventional postprocessing and resolution estimation routines. As described in the 'Discussion' section, by setting aside a small validation set of images, the two independent decoders also allow estimation of the errors in the displacement fields.

DynaMight has been implemented in PyTorch16, and is accessible as a separate job type from the RELION-5 graphical user interface. Because, as we will show below, the direct regularization of the deformation fields using atomic models may lead to overfitting, only the approach that enforces smoothness on the deformations, without the use of an atomic model, is exposed to the user in the graphical user interface. DynaMight uses the Napari viewer17 to visualize the distribution of particles in latent space, as well as the corresponding deformation fields. The same viewer also allows real-time generation of densities from points in latent space, movie generation and the selection of particle subsets in latent space. Further implementation details are given in the Methods.

Regularization can lead to model bias

We first analyzed the different options for regularization of the deformations on a well-characterized dataset of the yeast Saccharomyces cerevisiae precatalytic B complex spliceosome18 (EMPIAR-10180, ref. 19). The same data, or subsets of it, have also been analyzed using multi-body refinement5, cryoDRGN9, Zernike3D13 and e2gmm10. To minimize computational costs and to ensure structural homogeneity9, we used 3D classification in RELION20 to select ~45,000 particles with reasonable density for the head region. Training of the VAEs on this subset with a box size of 320 took about 2.5 minutes per epoch on a single NVIDIA A100 GPU. This resulted in training times between 8 and 12 hours for estimating the deformations. Further estimation of the inverse deformations took ~4 hours and reconstruction with the deformed backprojection ~3 hours on the same GPU.

Without any regularization of the deformations, the estimated deformation fields displayed rapidly changing directions for neighboring Gaussians, and deformed backprojection yielded reconstructions for which the local resolution did not improve with respect to the original consensus reconstruction (Fig. 2a,b). A consensus reconstruction with better local resolutions was obtained using the regularization scheme that enforces smoothness in the deformations, but without using an atomic model (Fig. 2c). The map with the highest local resolutions was obtained using the regularization scheme that restrains distances between bonded atoms of an atomic model (Protein Data Bank (PDB) ID 5nrl) (Fig. 2d). It thus appeared that incorporation of prior knowledge from the atomic model into the VAE had been beneficial.

Fig. 2: DynaMight reconstructions of the spliceosome subset. a, Standard RELION consensus refinement. b, DynaMight without regularization. c, DynaMight with smoothness regularization on the Gaussians. d, DynaMight with regularization from an atomic model. All maps are colored according to local resolution, as indicated by the color bar.

However, because the neural networks in our approach comprise many parameters, we were worried that there would be scope for 'Einstein-from-noise' artifacts, similar to those described for orientational assignments in single-particle analysis21,22,23. To test this, we performed two control experiments.

In the first control experiment, we replaced the atomic model of the U2 3′ domain/SF3a domain with a different protein domain of similar size (PDB 7YUY, ref. 24). The U2 3′ domain/SF3a showed only weak density in the consensus map, indicating large amounts of structural heterogeneity in this region. Although using the incorrect atomic model to estimate the deformation fields led to a similar improvement in local resolution compared to using the correct model (Fig. 3a,b),
the reconstructed density from the deformed backprojection resembled the incorrect model, rather than the correct model (Fig. 3c and Supplementary Video 1).

Fig. 3: Using incorrect atomic models in DynaMight. a, Reconstruction after deformed backprojection using the correct atomic model for the SF3a region, colored by local resolution (right). The correct atomic model for the SF3a region is shown in green on the top left; an overlay of that model with the reconstructed density after deformed backprojection is shown on the bottom left. b, As in a, but using an incorrect atomic model for the SF3a region (shown in red). c, Fourier shell correlation (FSC) curves between the maps in a or b, masked around the SF3a region, and the correct (green) or incorrect (red) atomic models. d–f, As in a–c, but using different atomic models for the SF3b region: reconstruction after deformed backprojection using the correct atomic model for the SF3b region (d), using an incorrect atomic model for the SF3b region (e) and FSC curves between maps in d or e, masked around the SF3b region, and the correct or incorrect atomic models (f).

In the second control experiment, we replaced the atomic model of the SF3b domain with PDB 1G88 (ref. 25). The density for the SF3b domain in the consensus map was stronger than the density for the SF3a region, indicating that this region of the spliceosome is less flexible. In this case, using the incorrect atomic model yielded a map with lower local resolutions in the SF3b region than using the correct model (Fig. 3d,e). Still, the reconstructed density from the deformed backprojection resembled the incorrect model more than the correct model (Fig. 3f and Supplementary Video 2).

These results indicate that estimation of deformation fields may lead to model bias, to the extent that the reconstructed density may reproduce features of an incorrect atomic model. The scope for model bias to affect the deformed backprojection reconstruction is larger in regions of the map with higher levels of structural heterogeneity. Because it would be difficult to distinguish correct atomic models from incorrect ones, we caution against the use of this type of regularization in DynaMight. Therefore, in what follows, we only used the less informative smoothness prior on the deformations. Using this prior, the deformations estimated by DynaMight are qualitatively similar to those observed for the same dataset using e2gmm10 (Extended Data Fig. 2 and Supplementary Video 3). For a different dataset (EMPIAR-10073, of the U4/U6.U5 tri-snRNP complex26), using the less informative smoothness prior in DynaMight led to an improved reconstruction with better map features and higher local-resolution estimates than reported for 3DFlex12 (Extended Data Fig. 3 and Supplementary Video 4), even though 3D classification in RELION-5 selected a structurally homogeneous subset of only 86,624 particles, compared to the 102,500 particles used for 3DFlex.

DynaMight improves inner kinetochore maps

Next, we demonstrate the usefulness of DynaMight on two cryo-EM datasets of the yeast inner kinetochore27. Training of the VAEs took 17 and 27 hours on an NVIDIA A100 GPU for the two respective datasets described below, with particle box sizes of 320 and 360. Estimating the inverse deformations took ~6 hours for both datasets.
The deformed reconstructions took 9 and 13 hours, respectively.

The first dataset (EMPIAR-11910) comprises 100,311 particles of the monomeric constitutive centromere-associated network complex bound to a CENP-A nucleosome (CCAN–CENP-A). For this dataset, we trained the half-set VAEs for 220 epochs and used a ten-dimensional latent space. The estimated 3D deformations are distributed uniformly in latent space (Fig. 4a), without specifically clustered conformational states, suggesting that the motions in the dataset are mainly of a continuous nature. Analysis of the motions revealed that the nucleosome rotates in different directions relative to the rest of the complex, and that these rotations coexist with up and down bending of the Nkp1, Nkp2, CENP-Q and CENP-U subunits (arrows in Fig. 4b and Supplementary Video 5). The reconstruction from deformed backprojection improved local resolutions compared to the consensus map from standard RELION refinement, with clear improvements in the features for both protein and DNA (Fig. 4c,d and Extended Data Fig. 4).

Fig. 4: DynaMight results for the CCAN–CENP-A complex. a, Principal components analysis (PCA) of the conformational latent space, with colored dots indicating the positions of the five maps in b. (Only the latent space for one of the two half sets is shown.) b, Five conformational states of the complex. One state, in red, is shown in all four panels. The colors of the five maps are the same as the colors of their corresponding dots in a. c, Reconstructions from standard RELION consensus refinement. d, The improved reconstruction using DynaMight. The maps in c and d are colored according to local resolution, as indicated by the color bar.

The second dataset (EMPIAR-11890) comprises 108,672 particles of the complete yeast inner kinetochore complex assembled onto the CENP-A nucleosome. The VAEs were trained for 290 epochs, and the dimensionality of the latent space was again set to ten. Again, a continuous distribution of deformations in latent space suggests continuous structural flexibility (Fig. 5a). Analysis of the deformations revealed large relative motions between different regions of the complex (root-mean-squared deviations and additional details are given in Supplementary Table 1). Different states of the complex are depicted in Fig. 5a and Supplementary Video 6. Deformed backprojection resulted in a map with improved local resolution and protein and DNA features compared to the map from consensus refinement (Fig. 5b,c and Extended Data Fig. 4).

Fig. 5: DynaMight results for the complete kinetochore complex. a, PCA of the conformational space (on the left) with highlighted positions of five conformational states, the maps of which are shown in the same colors on the right. (Only the latent space for one of the two half sets is shown.) b, Maps from standard RELION consensus refinement. c, DynaMight reconstruction. d, Reconstruction using RELION multi-body refinement. The outlined regions in the latter show the four bodies that were used for multi-body refinement. The maps in b–d are colored by local resolution, as indicated by the color bar.

Because this complex, with a molecular weight of 1.5 MDa, is large enough to divide into multiple independently moving rigid bodies, we also applied multi-body refinement5 to this dataset. We used the four bodies illustrated in Fig. 5d:
body 1 (orange), CCAN\({}^{{\rm{Topo}}}\); body 2 (light green), \({\rm{CCAN}}^{{\rm{Non}}-{{\rm{topo}}}_{\Delta }{\rm{CENP}}-{\rm{I}}({\rm{Body}})}\); body 3 (yellow), \({\rm{CBF3}}^{{\rm{Core}}}\) + CENP-I\({}^{{\rm{Body}}}\); and body 4 (dark green), CENP-A\({}^{{\rm{Nuc}}}\). The local resolutions resulting from multi-body refinement (Fig. 5d) are better than those from the deformed backprojection reconstruction of DynaMight, illustrating that there is still room for further development of the latter. Nevertheless, the DynaMight map had better protein and nucleic acid features than a map obtained for the same dataset with 3DFlex, using default parameters12 (Extended Data Fig. 5). The DynaMight map also correlated better than the map from 3DFlex with atomic models that were built into the maps from multi-body refinement. Despite these observations, resolution estimates calculated from the 3DFlex half-maps were higher than those calculated from the DynaMight half-maps. This suggests that using a single 3D deformation model in 3DFlex, rather than two separate models as done in DynaMight, could potentially result in overestimation of local resolution.
