Single-molecule digital sizing of proteins in solution

Rationale for developing smMDS

Micron-scale measurements of molecular diffusivity have proven to be a versatile and sensitive approach for probing the molecular sizes of proteins, their interactions, and assembly states in solution39,40,41. In particular, microfluidic diffusional sizing (MDS) has become an attractive, quantitative method for characterization of proteins and protein complexes under native solution conditions42,43,44,45,46,47,48,49,50,51,52,53,54. MDS exploits the unique features of laminar flow in the microfluidic regime and measures the diffusive mass transport of molecules across co-flowing sample and buffer streams within a microchannel41,42. By monitoring the diffusive spreading of analyte molecules at different downstream channel positions and analyzing the recorded diffusion profiles with advection–diffusion models, the sizes of analyte molecules can be quantified in terms of their hydrodynamic radii RH. Importantly, MDS offers the advantage of being calibration-free, as it directly retrieves RH from model fitting, thereby eliminating the need for external calibration standards. MDS can further probe the formation of biomolecular interactions and the assembly of analyte molecules into higher-order structures by monitoring the increase in size associated with complex formation, and can retrieve binding affinities (i.e., dissociation constants, KD) through measurement of binding curves. However, despite the versatility of the approach, current implementations of MDS and other similar microfluidic methods are limited in their ability to resolve compositional heterogeneities. This is because they rely on ensemble readouts that average the signal over all species present, and therefore yield only an average RH for mixtures of differently sized species42,55,56. This limits information content, particularly when studying heterogeneous, multicomponent systems.
Additionally, detection sensitivities for MDS and other similar sizing techniques are in the nanomolar to micromolar range, which hampers ultrasensitive protein detection at concentrations in the often desirable pico- to femtomolar range42,55,56.

A more sensitive detection method, such as confocal fluorescence detection, could offer an effective solution to these challenges. This technique enables ultrasensitive detection at the single-molecule level57,58,59, allows for direct digital readouts by counting individual proteins and protein complexes in solution60, and can be seamlessly integrated with microfluidic platforms60,61,62,63,64, thus setting the stage for developing a single-molecule digital sizing method. We therefore reasoned that combining single-molecule fluorescence detection with diffusivity measurements in microchannels would create a robust platform for sizing proteins at the single-molecule level in a calibration-free manner. Such a platform would be ideally suited for characterizing heterogeneous and multicomponent protein systems directly in solution.

Working principle and experimental implementation of smMDS

Figure 1a illustrates the working principle and experimental implementation of smMDS. smMDS measures the molecular diffusivity of analyte molecules within a microfluidic chip. It operates based on the principles of MDS42: an analyte stream is flow-focused between two auxiliary buffer streams, and the diffusive spreading of analyte molecules to either side of the channel is observed as they travel downstream (see Methods). Because different positions along the channel correspond to different diffusion times, tracking the diffusive broadening of species at different channel positions allows calibration-free quantification of the diffusion coefficient D and, thus, extraction of the size of analyte molecules in terms of RH42.
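The conversion between the diffusion coefficient D and the hydrodynamic radius RH follows the Stokes–Einstein relation, RH = kB·T / (6π·η·D). The short sketch below illustrates this conversion only; it is not the analysis code used for smMDS. The function names, the temperature of 298.15 K, and the viscosity of 0.89 mPa·s (approximately that of water at 25 °C) are assumptions chosen for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def hydrodynamic_radius(d_m2_per_s, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Stokes-Einstein: R_H = k_B * T / (6 * pi * eta * D), result in metres.

    temperature_k and viscosity_pa_s are illustrative defaults (assumed,
    roughly water at 25 degrees C), not values taken from the text.
    """
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * d_m2_per_s)


def diffusion_coefficient(r_h_m, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Inverse relation: D = k_B * T / (6 * pi * eta * R_H), result in m^2/s."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * r_h_m)


if __name__ == "__main__":
    # Example: a protein with R_H = 3.73 nm (the literature mean quoted for HSA).
    d = diffusion_coefficient(3.73e-9)
    print(f"D = {d * 1e12:.1f} um^2/s")
    print(f"R_H = {hydrodynamic_radius(d) * 1e9:.2f} nm")
```

Because RH scales inversely with D, a twofold larger particle diffuses half as fast, which is why profiles recorded further downstream (longer diffusion times) are needed to size larger species.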
Experimentally, smMDS measurements are conducted by introducing a sample containing fluorescent protein and buffer into the sizing chip and monitoring the micron-scale diffusive mass transport of molecules across the channel as they flow downstream. Fluid flow in the channel is controlled by applying a negative pressure at the device outlet with a syringe pump (see Methods for details).

Fig. 1: Working principle and experimental implementation of smMDS. a Schematic of the microfluidic chip design and integrated confocal scanning optics. The most relevant components are depicted. The dashed box highlights the scan region. The arrow indicates the scan trajectory across the four innermost channels. b Principle of continuous scan measurements. The confocal detection volume is moved at a constant speed across the microfluidic device, enabling the recording of diffusion profiles from direct intensity readouts. This mode enables recording of diffusion profiles under ensemble conditions. An exemplary diffusion profile from a continuous scan measurement of human serum albumin (HSA) at 100 nM is shown. Diffusion profiles are shown as blue lines, experimental fits as orange lines, and the local radius errors as green bands. Extracted hydrodynamic radii RH [with errors] are given as an inset. The local radius error is calculated as the difference between the hydrodynamic radius derived from the global fit and that obtained from the best matching profile at that specific position. The error range for RH is derived from the global fit, determined through a Taylor expansion of the least-squares fit and through error propagation (see Supplementary Note 1 for details). c Principle of step scan measurements. The confocal detection volume is moved in a stepwise manner across the device, collecting data at defined positions with each step for a certain period of time in the form of time traces (see panel d).
This mode enables detection of individual molecules and the creation of diffusion profiles from single-molecule digital counting. An exemplary diffusion profile from a step scan measurement of α-synuclein at 10 pM is shown. d A single-molecule time trace (lower panel) as obtained from a step scan measurement is shown. The time trace in the upper panel is a zoom-in view of the red shaded area in the lower panel. Red dots and highlighting indicate bursts detected by the burst-search algorithm. The bin time is 1 ms in all traces.

Detection in smMDS is achieved using a high-sensitivity laser confocal fluorescence microscope incorporated into the microfluidic platform (see Methods). By scanning the confocal volume across the microfluidic chip at the mid-height of the channel, perpendicular to the flow direction (Fig. 1a), fluorescence from passing analyte molecules is recorded. The scan trajectory is chosen such that various positions along the channel are probed, including positions close to the nozzle, where the sample stream meets the co-flowing buffer medium, and positions further downstream. In our implementation, the four innermost channels of the device are scanned to obtain diffusion profiles. These selected channels cover a wide range of distances and time points along the channel’s length. This enables the analysis of biomolecular analytes with RH ranging from less than 1 nm to greater than 100 nm, paralleling the established range in standard MDS experiments42,47,65,66. The scan trajectory of the confocal volume in the x, y, and z directions is set through two scan markers integrated within the microfluidic chip adjacent to the channels.

Scanning is conducted in two modes: either by continuously moving the confocal volume through the chip (Fig. 1b), or by moving the observation volume along the same trajectory in a stepwise manner, collecting data at defined positions with each step (Fig. 1c).
In continuous scan mode, diffusion profiles are rapidly acquired from direct intensity readouts. In this process, the confocal volume is moved through the device at a constant scan speed (tens of µm/s) and the fluorescence intensity from analyte sample flowing through the confocal volume is recorded. This allows swift recording of diffusion profiles under ensemble conditions, that is, at concentrations where many molecules are present in the confocal volume (typically at concentrations greater than tens of pM). An exemplary diffusion profile obtained from a continuous scan experiment is shown in Fig. 1b. To extract RH, the recorded diffusion profiles are analyzed using an advection–diffusion model (see Methods). This process involves fitting the experimental profiles to a set of simulated profiles generated through numerical simulations. Using a least-squares error algorithm, the best-matching simulated profile is identified to extract D, and RH is then retrieved using the Stokes–Einstein relationship (Fig. 1b).

In step scan mode, diffusion profiles are generated from time trace recordings along the scan trajectory. An exemplary diffusion profile obtained from a step scan experiment is shown in Fig. 1c. In this process, the confocal volume is parked at various positions along the trajectory, and fluorescence signals from analyte flowing through the confocal volume are recorded for a certain period of time. Typically, between 200 and 400 scan steps across the chip are performed from start to end position, and 2- to 4-second-long fluorescence traces are recorded at each position. Importantly, due to the high sensitivity of confocal detection, measurements in step scan mode enable the detection of individual molecules and, thus, the creation of diffusion profiles from single-molecule counting (Fig. 1d). Here, bursts of fluorescence corresponding to the passage of single molecules through the confocal volume are recorded at each channel position.
To estimate the number of molecules at each scanned position, a burst-analysis algorithm is employed (see Methods). This algorithm uses a combined threshold criterion, consisting of a maximum interphoton time (IPTmax) and a minimum total number of photons (Nmin), to extract single-molecule events from the recorded time trace at each position. This approach has been shown to enable effective discrimination between photons that originate from single fluorescent molecules and those that correspond to background, thus allowing individual molecules to be counted directly, that is, in a digital manner60. From the detected number of molecules at each position, diffusion profiles are then created by plotting the number of counted molecules as a function of chip position. Extraction of RH is done analogously to the continuous scan experiments described above, by fitting the experimental diffusion profiles with our advection–diffusion analysis model. Figure 1c depicts fits and extracted RH values for the example data set.

Sizing of proteins from bulk to single-molecule conditions by smMDS

In a first set of experiments, we sought to evaluate the sensitivity of smMDS and demonstrate its capability to determine protein size from bulk to single-molecule conditions. To this end, we labeled human serum albumin (HSA) with a fluorescent dye (Alexa 488) (see Methods and Supplementary Table 1) and performed concentration series measurements, both in continuous and step scanning mode, by varying the HSA concentration. The recorded diffusion profiles of the series are shown in Fig. 2b, c.

Fig. 2: Sizing of proteins from bulk to single-molecule conditions by smMDS. a The sensitivity of smMDS and its capability to size proteins from bulk to single-molecule conditions were evaluated by measuring the size of human serum albumin (HSA) at varying protein concentrations. b Diffusion profiles for HSA as obtained from continuous scan measurements. From top to bottom: 1 µM, 10 nM, 1 nM, 20 pM HSA.
Diffusion profiles are shown as blue lines, experimental fits as orange lines, and errors as green bands. Extracted hydrodynamic radii RH [with errors] are given as insets. For definitions of errors, please refer to the legend of Fig. 1b. Schematics on the left depict the decrease in concentration. c Diffusion profiles for HSA obtained from step scan measurements under single-molecule conditions. From top to bottom: 20 pM, 10 pM, 1 pM, 100 fM HSA. Diffusion profiles are shown as blue lines, experimental fits as orange lines, and errors as green bands. Extracted hydrodynamic radii RH [with errors] are given as insets. For definitions of errors, please refer to the legend of Fig. 1b. Schematics on the left depict the decrease in concentration. The two highlighted plots on the top are exemplary single-molecule time trajectories recorded at two channel positions, as indicated. Red dots and highlighting indicate bursts detected by the burst-search algorithm. d RH of HSA as obtained by continuous scan (orange points) and step scan (blue points) measurements. Data points (mean) were obtained from at least triplicate measurements at the respective sample concentration (see source data for number of repeats). Error bars denote standard deviations. The dashed line indicates the average literature value (mean) for HSA (RH = 3.73 ± 0.40 nm)67,68,69,70 with the green band depicting the standard deviation of literature values. Source data are provided as a Source Data file.

In the range from 1 µM down to tens of pM of HSA (Fig. 2b), sufficient molecular flux of HSA protein molecules allowed for the recording of diffusion profiles from continuous scan experiments. The obtained profiles show the characteristic broadening due to diffusion of molecules along the channels. Narrow peaks are observed at channel positions close to the nozzle, where the sample meets the carrier medium, and the peaks broaden at channel positions further downstream.
Modeling of the diffusion profiles using our advection–diffusion analysis approach yielded excellent fits. Extracted RH values (Fig. 2d and Supplementary Table 2) were, across all concentrations and within error, in excellent agreement with previously reported values for HSA67,68,69,70, demonstrating the robustness and accuracy of the approach. Of note, in some cases, the diffusion profiles display a minor offset from the channel centers. We attribute this to potential small imperfections upstream in the flow path, which can slightly shift the peak positioning. However, the global fitting procedure we use for extracting diffusion constants is resilient towards such shifts within channels, ensuring that the overall analysis remains robust and accurate.

As we approached the single-molecule regime (i.e., at and below 20 pM) (Fig. 2c), we performed step scan measurements by moving the confocal volume in a stepwise manner across the channels. We observed bursts of fluorescence corresponding to the passage of single HSA molecules through the confocal volume (Fig. 2c, top panels). Using the burst-search algorithm (see Methods), we extracted single-molecule events from the recorded time traces at each position (Fig. 2c, top panels) and created diffusion profiles by plotting the number of counted molecules as a function of chip position (Fig. 2c). In this way, we obtained diffusion profiles from digital counting for HSA from 20 pM down to 100 fM. Remarkably, in this regime too, our advection–diffusion analysis yielded excellent fits of the experimental data. We retrieved RH values that were in agreement with previously reported values for HSA67,68,69,70 and within an error margin of 10% (Fig. 2d and Supplementary Table 2).
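The digital counting step relies on the combined IPTmax/Nmin burst criterion. A minimal sketch of how such a criterion could be applied to a list of photon arrival times is given below. This is an illustration under stated assumptions, not the burst-search implementation described in Methods: the function name `find_bursts`, the grouping strategy (splitting the photon stream wherever a gap exceeds IPTmax), and the synthetic arrival times are all hypothetical.

```python
import numpy as np


def find_bursts(arrival_times, ipt_max, n_min):
    """Return bursts as (start_time, end_time, n_photons) tuples.

    Consecutive photons separated by at most ipt_max are grouped together;
    a group counts as a burst only if it holds at least n_min photons.
    Hypothetical helper for illustration; times share the unit of ipt_max.
    """
    t = np.asarray(arrival_times, dtype=float)
    if t.size == 0:
        return []
    # A gap larger than ipt_max ends the current candidate group.
    breaks = np.flatnonzero(np.diff(t) > ipt_max) + 1
    groups = np.split(np.arange(t.size), breaks)
    bursts = []
    for g in groups:
        if g.size >= n_min:
            bursts.append((float(t[g[0]]), float(t[g[-1]]), int(g.size)))
    return bursts


if __name__ == "__main__":
    # Synthetic photon stream (ms): sparse background plus one dense burst.
    photons = [0.0, 5.0, 5.01, 5.02, 5.03, 5.04, 9.0, 12.0, 12.01, 12.02]
    for start, end, n in find_bursts(photons, ipt_max=0.05, n_min=4):
        print(f"burst from {start} to {end} ms with {n} photons")
```

Counting the bursts returned at each scan position, and plotting that count against chip position, gives a diffusion profile of the kind fitted in Fig. 2c. Raising `n_min` keeps only brighter events, while `ipt_max` controls how tightly photons must cluster to be treated as one molecule.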
Of note, HSA exists in an equilibrium of monomers and low-order oligomers, which are populated with decreasing relative abundance32,71; hence, the RH values reported are weighted averages reflecting this distribution of monomeric and oligomeric forms (for a deconvolution analysis, see Fig. 5). Overall, the results shown here demonstrate that smMDS provides accurate size information over a broad range of concentrations and enables ultrasensitive sizing of proteins even in the picomolar to femtomolar concentration regime. We also evaluated the influence of parameter selection in the single-molecule regime on the extracted sizes (Supplementary Fig. 1) and found that a wide range of burst selection parameters, that is, varying thresholds for IPTmax and Nmin (see Methods), yielded the expected size information, supporting the robustness of the approach.

To compare the sensitivity of the smMDS technique to conventional MDS measurements, we conducted experiments using fluorescence widefield imaging (Supplementary Fig. 2). We performed concentration series measurements, starting at 1 µM of labeled HSA and then decreasing the protein concentration to 100 nM and 50 nM. Image analysis yielded a clear profile only for the measurements at 1 µM and 100 nM protein concentration, and the expected size for HSA could only be recovered for the measurement at 1 µM protein. Notably, the measurement at 50 nM HSA yielded a featureless profile that could not be fitted and, hence, no RH could be determined. This shows that conventional MDS experiments are limited to concentrations in the tens of nanomolar range.
In comparison, the high sensitivity and digital detection capabilities afforded by smMDS allow the size of proteins to be measured down to femtomolar concentrations, thereby extending the sensitivity range of diffusional sizing experiments by more than five orders of magnitude.

Sizing of proteins and protein assemblies by smMDS

Next, we sought to demonstrate the wide applicability of smMDS in determining the size of proteins and protein assemblies from single-molecule digital counting. We selected a varied set of analytes differing in size, including the proteins lysozyme, RNase A, α-synuclein, human leukocyte antigen (HLA), HSA, thyroglobulin, and oligomers formed by the protein α-synuclein. These protein analytes collectively span a size range of 1–10 nm. We also included the small organic fluorophore Alexa 488 in the series as a sub-nm sized analyte (Fig. 3a). The protein analytes were fluorescently labeled and purified before analysis (see Methods and Supplementary Table 1). We performed smMDS measurements in step scan mode at an analyte concentration of 10 pM. We moved the confocal spot in a stepwise manner through the channels and extracted single-molecule events for each analyte at each channel position by digitally counting molecules to create diffusion profiles, which we fitted with our advection–diffusion model. Exemplary diffusion profiles for thyroglobulin, HLA, and Alexa 488 are shown in Fig. 3b. The obtained RH values from smMDS were then plotted against previously reported RH values for Alexa 48872, lysozyme73, RNase A74, α-synuclein75, HLA53, HSA67,68,69,70, thyroglobulin76, and α-synuclein oligomers77 (Fig. 3a). The values obtained by smMDS followed the expected trend within error.
This demonstrates the excellent agreement between sizes obtained from smMDS and literature values, highlighting the reliability of the single-molecule diffusivity measurements in size determination of protein analytes. In an additional analysis step, we plotted the experimentally obtained RH values against the molecular weight MW. Both folded proteins (lysozyme, RNase A, HLA, HSA, thyroglobulin) and unfolded protein species (α-synuclein monomer and oligomers) followed the expected scaling behavior for globular (RH ∝ \(M_W^{1/3}\)) and disordered (RH ∝ \(M_W^{0.6}\)) proteins, respectively (Fig. 3a, inset).

Fig. 3: Sizing of proteins and protein assemblies by smMDS. a Experimentally determined hydrodynamic radii RH for various protein species as obtained from smMDS plotted against literature values. The dashed line depicts the expected trend. Data points (mean) were obtained from at least triplicate measurements at 10 pM sample concentration (see source data for number of repeats). Error bars denote standard deviations. Inset shows obtained RH (mean ± standard deviation) as a function of molecular weight MW. The dashed and dotted lines denote scaling behavior of globular (RH ∝ \(M_W^{1/3}\)) and disordered (RH ∝ \(M_W^{0.6}\)) proteins, respectively. b Exemplary diffusion profiles for thyroglobulin (blue), human leukocyte antigen (HLA) (violet), and Alexa 488 (cyan) at 10 pM sample concentration. Diffusion profiles are shown as blue lines, experimental fits as orange lines, and error as green bands. Extracted RH [with errors] are given as insets. For definitions of errors, please refer to the legend of Fig. 1b. Source data are provided as a Source Data file.

Quantifying protein interactions by smMDS

Next, we set out to demonstrate the capability of smMDS in determining the affinity of biomolecular interactions at the single-molecule level.
Interactions of proteins with secondary biomolecules, in particular with other proteins, are of great importance across the biosciences, and quantitative measurements of affinity constants in the form of KD values have become vital in biomedical research and clinical diagnostics, for example, for histocompatibility testing and affinity profiling46,53,54,78. Diffusional sizing allows for the detection of biomolecular interactions by monitoring the increase in size associated with binding and complex formation42,48. By acquiring binding isotherms, affinity constants of the interaction can be determined in solution, without the need for purification or for immobilization on a surface. So far, diffusional sizing has been limited to the sizing and quantification of protein interactions at bulk nanomolar concentration levels; with smMDS, this barrier can be overcome.

To demonstrate the detection of biomolecular interactions and quantification of binding affinities by smMDS in a digital manner, we probed the binding of a clinically relevant antibody–antigen interaction. Specifically, we investigated the binding interaction between HLA A*03:01, an isoform of the major histocompatibility complex type I (MHC) and a key factor in the human immune system79, and W6/32, an antibody that binds to all class I HLA molecules (Fig. 4a)80. We performed a series of step scan smMDS experiments by titrating HLA antigen (labeled with Alexa 488), at a constant concentration of 100 pM, with increasing amounts of the unlabeled W6/32 antibody. We opted for labeled HLA and unlabeled W6/32 antibody in our study, motivated by the clinical significance of detecting anti-HLA antibodies in patient serum, especially in scenarios involving organ transplantation53. Exemplary diffusion profiles for pure HLA at 100 pM, and for 100 pM HLA titrated with 10 nM of W6/32, are shown in Fig. 4b.
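In a titration of this kind, the labeled species is held at a fixed concentration and the measured size rises as more of it is bound by the unlabeled partner. The sketch below shows how a dissociation constant relates to the bound fraction and to an effective size in the simplest case. It deliberately assumes 1:1 binding and a linear, fraction-weighted average of the free and bound radii; both are simplifying assumptions made for illustration, and the binding model used for the actual analysis (two antigen molecules per antibody, see the legend of Fig. 4d) differs. The function names are hypothetical.

```python
import math


def fraction_bound(labeled_total, titrant_total, kd):
    """Fraction of labeled species in complex for 1:1 binding (assumed model).

    Solves the quadratic mass-balance equation; all three arguments must be
    in the same concentration unit (e.g. pM).
    """
    s = labeled_total + titrant_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * labeled_total * titrant_total)) / 2.0
    return complex_conc / labeled_total


def effective_radius(labeled_total, titrant_total, kd, r_free, r_bound):
    """Fraction-weighted average radius (a simplifying assumption)."""
    f = fraction_bound(labeled_total, titrant_total, kd)
    return r_free + f * (r_bound - r_free)


if __name__ == "__main__":
    # Illustrative inputs: 100 pM labeled antigen, K_D of 400 pM, and
    # free/bound radii of 3.18 nm and 5.08 nm.
    for titrant_pm in (0.0, 100.0, 400.0, 1e3, 1e4, 1e5):
        r = effective_radius(100.0, titrant_pm, 400.0, 3.18, 5.08)
        print(f"{titrant_pm:>9.0f} pM titrant -> effective R_H = {r:.2f} nm")
```

Because the labeled species is kept well below the dissociation constant, the midpoint of such a curve lies close to KD itself, which is why sub-nanomolar affinities require measurements at picomolar concentrations of the labeled component.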
smMDS diffusion profiles from three repeats were acquired and fitted to obtain the effective RH across the concentration series. We observed an increase in average hydrodynamic radius from RH = 3.18 ± 0.04 nm for pure HLA, corresponding to a molecular weight of 50 kDa, as expected for HLA, to RH = 5.08 ± 0.01 nm for the saturated complex, corresponding to a molecular weight of 215 kDa, consistent with the binding of a 150 kDa antibody to HLA (Fig. 4c). By fitting the binding isotherm (Fig. 4d), we determined the dissociation constant to be KD = 400.5 ± 39.6 pM, consistent with previous results53. Importantly, HLA antibodies are an extensively used clinical biomarker to evaluate, for example, histocompatibility and the risk of allograft rejection53. Given that these antibodies are usually found in patient serum at very low concentrations, our findings offer a method for detecting and profiling antibody responses even when only minimal sample quantities are available.

Fig. 4: Quantifying protein interactions by smMDS. a Sizing and affinity measurement of an antibody–antigen complex by smMDS. Shown is a schematic for the binding interaction between human leukocyte antigen (HLA) (A*03:01) and the HLA-antibody W6/32. b Exemplary diffusion profiles for the binding of the HLA-antibody W6/32 to HLA as obtained by step scan smMDS. The left panel shows a diffusion profile for 100 pM HLA. The right panel shows a diffusion profile for 100 pM HLA in the presence of 10 nM W6/32. Diffusion profiles are shown as blue lines, experimental fits as orange lines, and error as green bands. Extracted hydrodynamic radii (RH) [with errors] are given as insets. For definitions of errors, please refer to the legend of Fig. 1b. c Size increase upon complexation of HLA with W6/32 from RH = 3.18 ± 0.04 nm for pure HLA to RH = 5.08 ± 0.01 nm for the complex in the presence of 100 nM W6/32. Data points (mean) were obtained from triplicate measurements. Error bars denote standard deviations.
Notably, for enhanced clarity and to prevent overlap, data points are randomly positioned along the x-axis in each chart. d Binding isotherm obtained from a titration of 100 pM HLA with increasing concentrations of W6/32. For analysis, the binding isotherm was fitted with a binding model assuming two antigen molecules binding one antibody53. The dissociation constant was found to be KD = 400 ± 40 pM. Error bars are standard deviations from triplicate measurements. Source data are provided as a Source Data file.

Overall, our results here highlight the potential of quantifying biomolecular interactions through single-molecule digital measurements, even at very low concentrations. We note that the ability to measure at very low protein concentrations makes it possible to examine high-affinity interactions (i.e., with sub-nanomolar affinities). In these cases, the binding curves are shifted significantly towards lower concentrations, necessitating an approach capable of resolving binding curves in that range. Traditional bulk assays typically lack the sensitivity for such low concentrations. Step scan smMDS, by contrast, could provide the resolution needed to delineate high-affinity interactions. Furthermore, the single-molecule binding assay demonstrated here could be applied to other antibody–antigen systems (e.g., profiling of SARS-CoV-2 antibody interactions)46,50,54, and the labeling scheme, with labeled antigen and an unlabeled antibody, could be customized to suit the specific diagnostic requirements of the system being investigated.
For example, if the detection of low concentrations of antigen is more relevant in a certain diagnostic context, the roles can be reversed, with the antibody being labeled and the antigen remaining unlabeled42,48,81.

Resolving protein oligomeric states by smMDS

Many proteins fulfill their biological roles not as monomeric species but as oligomeric assemblies, which often exhibit significant heterogeneity in terms of their degree of oligomerization and relative abundance82,83,84. Oligomeric forms of proteins play important functional roles in cellular physiology but are also implicated in diseases such as neurodegeneration85,86,87. Resolving the degree of oligomerization and thus the size of such heterogeneous protein populations is, however, challenging with currently available biophysical techniques. A key feature of smMDS is its capability to directly distinguish between various assembly states of a protein based on differences in their emitted fluorescence signals. This feature is afforded by the single-molecule sensitivity of smMDS and enables the creation of diffusion profiles from the subspecies that make up the heterogeneous population. To demonstrate this capability, we carried out experiments with two protein oligomer systems that are inherently heterogeneous and have distinct functions in biology and disease.

In a first set of experiments, we determined the sizes of low-molecular-weight oligomers formed by the protein HSA (Fig. 5). Serum albumins are known to exist in an equilibrium of monomers, dimers, trimers, and tetramers, which are populated with decreasing relative abundance32,71. At the single-molecule level, such different oligomeric states can be discriminated through brightness analysis of fluorescence bursts24,25.
In this analysis, different species can be distinguished based on their emitted fluorescence intensity because the magnitude of the observed intensity scales directly with the number of individually dye-labeled monomer units present in an oligomeric assembly. By applying differential thresholding, oligomeric states can then be discriminated, which provides an opportunity for smMDS to create distinct diffusion profiles for each oligomeric subspecies, enabling their independent, species-specific sizing. To demonstrate this, we performed smMDS measurements to resolve the sizes of HSA monomers, dimers, trimers, and tetramers. We subjected labeled HSA to smMDS at 10 pM protein concentration and performed step scan measurements to extract single-molecule events from single-molecule time trace recordings. We then displayed the normalized burst intensities from all recorded burst events of the measurement in a burst intensity histogram (Fig. 5a). This allowed us to sort single-molecule burst events according to their brightness and assign regions of intensity for the monomeric and the different oligomeric HSA species. Accordingly, the main peak in the histogram reflects the average intensity of monomeric HSA. We extracted this intensity by fitting the distribution with a skew-normal distribution function, which accounts for the skewness of the burst intensity distribution due to undersampling effects at short burst times, and retrieved a mean intensity for the monomer of Imonomer = 75.33 photons/ms with a standard deviation σmonomer of 37.44 photons/ms. Since oligomers contain as many fluorophores as monomer units, dimeric, trimeric, and tetrameric forms of HSA emit at multiples of the normalized intensity of monomeric HSA.
We therefore defined regions at two-, three-, and four-fold the normalized intensity of the monomer, corresponding to HSA dimer, trimer, and tetramer, respectively, by fitting the burst intensity distribution with three Gaussian functions. The oligomer regions were assumed to have the same standard deviation as the monomeric protein. The resulting fit for all Gaussians, including the skew-normal distribution for the monomer, described the experimental burst intensity distribution well. Notably, the four-species model is validated by statistical analysis, as shown in Supplementary Fig. 3, indicating that a four-species model aligns best with the data (see Supplementary Note 2 for more information). We then generated diffusion profiles from the bursts within each of the four regions and fitted the profiles to extract size information from the respective monomer/oligomer range (Fig. 5b). The extracted sizes of the four different regions correspond to molecular weights of proteins of 65 (44–99) kDa, 145 (99–214) kDa, 266 (192–370) kDa, and 298 (192–790) kDa, respectively, and thus are, within error, in very good agreement with the sizes of monomeric, dimeric, trimeric, and tetrameric HSA (66 kDa, 132 kDa, 198 kDa, and 264 kDa, respectively). In addition to subspecies-resolved sizing of oligomers, our analysis can also provide information on the relative abundance of oligomeric species. The areas under the curves, as obtained from Gaussian fitting of the burst intensity histogram, reflect the abundance of oligomeric species. We obtained relative abundances of 67.9% for the monomer, 20.7% for the dimer, 7.2% for the trimer, and 4.2% for the tetramer (Fig. 5c). These values are in broad agreement with the equilibrium distribution of other serum albumins. For example, for bovine serum albumin, relative abundances of 88.63%, 9.94%, 1.18%, and 0.25% for monomers, dimers, trimers, and tetramers were previously found32.
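The brightness-based assignment of bursts to oligomeric states can be sketched in a few lines. The example below places region centers at integer multiples of the monomer intensity and assigns a burst to an oligomer order when its normalized intensity falls within a window around that center. The window of one monomer standard deviation on either side of each center is an assumption about how the region widths are applied, and the function names are hypothetical; this is an illustration, not the analysis code behind Fig. 5.

```python
I_MONOMER = 75.33      # photons/ms, monomer mean intensity quoted in the text
SIGMA_MONOMER = 37.44  # photons/ms, monomer standard deviation quoted in the text


def oligomer_regions(i_monomer=I_MONOMER, sigma=SIGMA_MONOMER, max_order=4):
    """Map oligomer order n -> (lower, upper) intensity bounds.

    Centers sit at n * i_monomer; each region spans center +/- sigma
    (assumed interpretation of the region width).
    """
    return {
        n: (n * i_monomer - sigma, n * i_monomer + sigma)
        for n in range(1, max_order + 1)
    }


def classify_burst(intensity, regions=None):
    """Return the oligomer order whose region contains the burst, else None."""
    if regions is None:
        regions = oligomer_regions()
    for order, (lower, upper) in regions.items():
        if lower <= intensity <= upper:
            return order
    return None


if __name__ == "__main__":
    for order, (lower, upper) in oligomer_regions().items():
        print(f"order {order}: {lower:.2f} to {upper:.2f} photons/ms")
    for burst in (60.0, 150.0, 230.0, 300.0, 10.0):
        print(burst, "->", classify_burst(burst))
```

With these numbers the four windows do not overlap, since twice the standard deviation (74.88 photons/ms) is slightly smaller than the spacing between centers (75.33 photons/ms). Bursts that fall outside every window are left unassigned, and the bursts assigned to each order can then be counted per scan position to build one diffusion profile per species.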
Overall, our analysis here shows that smMDS can afford species-resolved insights into oligomer size and abundance.

Fig. 5: Resolving protein oligomeric states by smMDS. a Sizing of low-molecular-weight oligomers formed by the protein human serum albumin (HSA). Shown is a burst intensity histogram, which displays all bursts extracted from a single smMDS step scan measurement of HSA (10 pM). Intensities are normalized with respect to burst duration. Four regions (blue, orange, green, and red), which correspond to HSA monomer, dimer, trimer, and tetramer, are defined in the burst intensity histogram. These were obtained by fitting the distribution with a skew-normal distribution function for monomeric HSA and three Gaussian functions for dimeric, trimeric, and tetrameric HSA (see main text for details). The center positions for the oligomers (dimer: 150.66 photons/ms, trimer: 225.99 photons/ms, and tetramer: 301.32 photons/ms) are multiples of the normalized intensity of the monomer (Imonomer = 75.33 photons/ms). The widths of the regions reflect one standard deviation of the distribution of monomeric HSA (σmonomer = 37.44 photons/ms). b Diffusion profiles for HSA monomer, dimer, trimer, and tetramer generated from bursts within each of the four regions in the burst intensity histogram shown in panel a (panels are color-coded according to the colors used in panel a). Each profile was fitted to extract size information. Diffusion profiles are shown as blue lines, experimental fits as orange lines, and error as green bands. Extracted hydrodynamic radii RH [with errors] are given as insets. For definitions of errors, please refer to the legend of Fig. 1b. c Species-resolved RH (left panel) and abundance of HSA oligomers (right panel). RH were obtained from diffusion profile fits shown in panel b and represent the best-fit values; error bars correspond to the error range of RH as derived from the global fit.
For definitions of errors, please refer to the legend of Fig. 1b. Abundance was obtained from skew normal/Gaussian fitting in panel a and represents the best-fit value; the error bars correspond to the 99% confidence intervals obtained from bootstrapping. Source data are provided as a Source Data file.

In a further set of experiments, we analyzed a heterogenous mixture of α-synuclein oligomers (Fig. 6). Oligomeric forms of the protein α-synuclein are considered to be central to the pathology of Parkinson’s disease and are hallmarked by a high degree of heterogeneity in terms of size and structure88,89,90. Their characterization is an area of intense interest, not least because such information is useful in drug development activities; however, tools that can directly resolve the heterogeneity of these nanoscale assemblies in solution are scarce77,91,92,93,94,95. To address this challenge and characterize the structural heterogeneity of α-synuclein oligomers, we analyzed a heterogenous mixture of α-synuclein oligomers by smMDS. We injected oligomers produced by lyophilization of Alexa 488-labeled α-synuclein into the microfluidic sizing chip at a concentration of 10 pM and performed step scan measurements to digitally extract single-molecule events of passing α-synuclein oligomer molecules at each channel position. To create diffusion profiles from subspecies, we selected bursts with different fluorescence intensities to resolve differently sized assembly states of oligomers within the mixture. Here we took an alternative approach to the analysis of HSA oligomers presented above and extracted bursts by varying the minimum number of fluorescence photons in the burst search algorithm, while keeping the inter-photon time threshold constant. This allowed us to effectively differentiate between single-molecule burst events that differ in their molecular brightness (see burst intensity histograms shown in Fig. 6a). 
In this way, diffusion profiles from assemblies that differ in their fluorescence intensity and, hence, size in terms of RH were generated. Exemplary diffusion profiles for four different thresholds are shown in Fig. 6b. These profiles were then fitted with our advection–diffusion model to extract size information. We applied this analysis to a range of photon thresholds, specifically between 5 and 50 total photons, to ensure that the chosen thresholds were representative of the intensity variations observed in oligomer burst signals. We then generated a plot of the extracted sizes versus photon thresholds from the diffusion profile series (Fig. 6c). The measured sizes span from RH = 3.6 nm for the smallest photon threshold value to RH = 16.5 nm for the largest threshold value. The value at the smallest threshold reflects the size of α-synuclein monomers (RH = 3.1 ± 0.4 nm) (c.f. Figure 3), while higher values reflect α-synuclein oligomer subpopulations, spanning a range of RH values greater than 10 nm. The RH-value distribution obtained by smMDS is in very good agreement with the size ranges reported in earlier studies, as determined by techniques such as atomic force microscopy (AFM)91,96,97,98, transmission electron microscopy (TEM)99,100,101,102,103, small-angle X-ray scattering (SAXS)92,99,104,105,106, and dynamic light scattering (DLS)104,107. In comparison, we also generated a diffusion profile from the entire set of bursts detected at each position, which through fitting yielded an ensemble-averaged value of the size of the oligomer population (RH = 5.2 ± 0.1 nm) (Fig. 6c). To complement the analysis demonstrated here based on varying the minimum number of fluorescence photon thresholds in the burst search algorithm, we also performed size analysis of α-synuclein oligomers by selecting defined regions in the burst intensity histogram as done for HSA. The results are shown in Supplementary Fig. 
4 and are in very good agreement with the results obtained by varying the fluorescence photon threshold value (cf. Fig. 6c). Taken together, our analyses here demonstrate the versatility of smMDS in resolving the size distributions of heterogenous oligomeric protein samples.

Fig. 6: Sizing of heterogenous oligomer populations by smMDS.

a A heterogenous mixture of α-synuclein oligomers (10 pM) was probed in a single-step scan smMDS measurement and single-molecule burst events were extracted using the burst search algorithm. To differentiate between differently sized assembly states of α-synuclein oligomers, the value for the minimum number of fluorescence photons threshold (Nmin) was varied, while keeping the inter-photon time threshold constant. This allowed for the creation of burst intensity distributions, which differ in molecular brightness. Exemplary burst intensity distributions for four different Nmin threshold values are shown (light blue: 5 photons, orange: 20 photons, green: 30 photons, red: 47 photons). The inset displays burst intensity histograms in semi-log scale. Intensities are normalized intensities with respect to burst duration. b Exemplary diffusion profiles generated from burst intensity distributions with four different minimum number of fluorescence photons threshold values (panels are color-coded according to the specific threshold values used in panel a). Diffusion profiles are shown as blue lines, experimental fits as orange lines, and error as green bands. Extracted hydrodynamic radii RH [with errors] are given as insets. For definitions of errors, please refer to the legend of Fig. 1b. c Extracted RH of oligomer subspecies displayed as a function of the different minimum number of photon threshold values used in the burst search algorithm. Data represent extracted RH [with errors], as reported in panel b. The colored vertical bars indicate threshold values used in panels a and b. 
The horizontal dashed line indicates the ensemble-averaged size of the entire oligomer population, generated from a diffusion profile from all bursts obtained from the measurement. RH of monomeric α-synuclein, as measured by smMDS, is indicated as a dotted horizontal line. Source data are provided as a Source Data file.

Sizing of multiple species within a heterogenous aggregation mixture by smMDS

Many protein systems are heterogeneous, multicomponent mixtures consisting of proteins and protein assemblies that differ in size by several orders of magnitude. For example, aggregation mixtures are made up of monomeric protein and large fibrillar species3,47. Often, one of the components (e.g., the monomeric protein) is present in large excess, while the other one (e.g., fibrillar species) is only present in small amounts. Approaches that can quantify the sizes of such differently populated species are much sought after yet lacking. smMDS can fill this gap as it has the capability to size molecules and assemblies in heterogeneous mixtures even when an excess of one of the molecular species is present at bulk levels.

To demonstrate the potential of sizing protein mixtures that are compositionally heterogenous, we set up experiments with a sample system composed of fibrils formed by the protein α-synuclein, a key component in the pathology of Parkinson’s disease108,109, and an excess of monomeric α-synuclein at nanomolar concentrations (Fig. 7a). Such a mixture is often encountered in assays that probe the mechanisms underlying protein aggregation and amyloid formation47,110. We first performed continuous scanning experiments on pure α-synuclein fibrils (at 10 nM monomer equivalent concentration) and pure α-synuclein monomer (at 10 nM concentration) to establish the signature of the two species (Fig. 7b). Notably, within the fibril sample, only 10% of the monomers are fluorescently labeled (see Methods). 
Similarly, only a fraction of 10% of labeled protein is present in the monomer sample, thus ensuring concentration parity with the fibrils and facilitating direct comparisons between the two.

Fig. 7: Sizing of multiple species within a heterogenous aggregation mixture by smMDS.

a Schematic of an aggregation reaction composed of monomeric α-synuclein and fibrillar species. b Sizing of α-synuclein fibrils in the presence of an excess of monomeric α-synuclein. Continuous scan diffusion profiles (left panel) for pure monomeric α-synuclein (10 nM) (blue), α-synuclein fibrils (10 nM monomer equivalent) (green), and a mixture of α-synuclein fibrils (9 nM monomer equivalent) and monomeric α-synuclein (1 nM) (pink). The right panel is a zoom-in as indicated by a dashed box in the left panel. Bursts correspond to the passing of fibrils through the confocal detection volume. c Step scan measurement of a mixture of α-synuclein fibrils (9 nM monomer equivalent) and monomeric α-synuclein (1 nM). The top panel shows an exemplary fluorescence time trace (1-ms binning) at diffusion profile position 340 µm, as indicated. An intensity threshold was applied to separate signal from fibrils (red) and monomer (purple). The bottom panels show diffusion profiles created from the fibril and monomer signals, respectively. Diffusion profiles are shown as blue lines, experimental fits as orange lines, and errors as green bands. Extracted hydrodynamic radii RH [with errors] are given as insets. For definitions of errors, please refer to the legend of Fig. 1b. d Comparison of extracted sizes from triplicate step scan measurements. Shown are RH of species extracted from a mixture of α-synuclein fibrils (9 nM monomer equivalent) and 1 nM monomeric α-synuclein (red and purple, respectively), pure monomeric α-synuclein (blue), 10 nM monomer equivalent of α-synuclein fibrils (green). Data points (mean) were obtained from triplicate measurements. Error bars denote standard deviations. 
Step scan measurements of pure α-synuclein (10 nM) (panel e) and pure fibrils (10 nM monomer equivalent) (panel f). The top panels show exemplary fluorescence time traces (1-ms binning) at diffusion profile positions 338 µm and 340 µm, respectively. Diffusion profiles are shown as blue lines, experimental fits as orange lines, and errors as green bands. Extracted RH [with errors] are given as insets. For definitions of errors, please refer to the legend of Fig. 1b. Source data are provided as a Source Data file.

For the fibril-only sample (Fig. 7b, green profile), we observed burst events of high fluorescence intensity that were narrowly distributed around the center of the channel. The high burst intensity stems from the large number of fluorophores that are contained in a single fibril (10% of the monomers are fluorescently labeled within fibrils, see Methods). The narrow distribution of bursts located at the center of the channel indicates a low diffusion coefficient and correspondingly a large size, as expected for fibrillar aggregates. For the monomer sample (Fig. 7b, blue profile), the sizing profiles exhibited a wider spread. This is attributed to the higher diffusivity of monomers compared to fibrils. The monomer signal is continuous because nanomolar concentrations are used, and therefore multiple monomeric units traverse the confocal detection volume at the same time, resulting in a bulk fluorescence signal that contrasts with the discrete single-molecule events observed in the fibril sample. In addition to establishing the signatures of fibrillar and monomeric samples, we probed a sample mix containing α-synuclein fibrils and an excess of the monomeric protein (Fig. 7b, pink profile). 
The diffusion profile of the mixture exhibited characteristic signatures for both fibrils and monomeric protein, with broadened fluorescence at the profile base, reflecting monomeric protein, in addition to bright bursts on top of the monomeric signal that were narrowly distributed at the center of the channel, reflecting signals from fibrils.

To demonstrate that smMDS is able to size both the monomeric subpopulation present at bulk levels and the fibrils present at single-particle concentrations, we performed step scan measurements with a mix containing α-synuclein fibrils and an excess of the monomeric protein (Fig. 7c). An example fluorescence time trace is shown in Fig. 7c (top panel). Fibrils are clearly detectable as bursts above the mean signal, which corresponds to the bulk monomer signal. From these traces, we separated the bulk monomer signal from the fibril burst signals by intensity thresholding. Specifically, fibrils were detected as bursts that exhibit a fluorescence count rate of >250 kHz, after applying a Savitzky-Golay smoothing filter. The remaining signal (i.e., the mean bulk signal in the fluorescence time traces in the absence of fibril signal) formed the signal for the monomer. From the extracted fibril and monomer signal, we created diffusion profiles for the two species (Fig. 7c, bottom panels) and subjected these profiles to fitting using our advection–diffusion model to extract size information of the two species. The sizes of the monomer and fibril species, from triplicate measurements (Fig. 7d), were estimated to be RH,monomer = 3.23 ± 0.04 nm and RH,fibrils = 56.43 ± 6.69 nm. As a control, we also performed step scan measurements for the monomer-only and fibril-only samples (Fig. 7e, f, respectively) and obtained sizes which were, within error, in excellent agreement with the ones obtained from the sample mix, thereby validating our approach (Fig. 
7d).

Together, these experiments show that smMDS has the capability to quantify differently sized molecules or assembly states of a protein within a heterogeneous mixture, even when an excess of one of the molecular species is present at bulk levels. These findings are significant as such an approach allows for the simultaneous probing of differently populated species, for example, in kinetic protein misfolding and aggregation studies.

Sizing of nanoscale clusters by smMDS

In a final set of experiments, we applied the smMDS approach to the characterization of nanometer-sized clusters of a phase separating protein system. Biomolecular condensates (Fig. 8a) formed through phase separation are important players in cellular physiology and disease111,112, and emerge from the demixing of a solution into a condensed, dense phase and a well-mixed, dilute phase113,114. Condensates typically have sizes in the micrometer range and are easily observable by conventional microscopy imaging115,116,117. Recent evidence suggests that phase separation-prone proteins, such as the DNA/RNA binding protein fused in sarcoma (FUS), can also form nanoscale assemblies (Fig. 8a), well below the critical concentration at which phase separation occurs (i.e., pre-phase separating regime)118,119,120,121. These so-called nanoscale clusters have sizes in the range of tens to hundreds of nanometers, and thus are below the resolution limit of conventional optical imaging systems, meaning that precise quantification of the cluster dimensions is infeasible. Moreover, as these species are low in abundance and present in a high background of dilute phase protein concentration, they are typically hard to detect by conventional wide-field imaging approaches. This is because wide-field fluorescence microscopy inherently captures significant out-of-focus light from the entire sample volume. 
Due to the high concentration of protein in the dilute phase, this leads to a high background signal that effectively masks the distinct fluorescence signal emanating from the clusters. Consequently, these clusters, while theoretically visible, become indistinguishable from the overwhelming background noise in standard wide-field epifluorescence images. This issue can be circumvented with confocal fluorescence detection, which utilizes optical sectioning to restrict detection to the in-focus plane only. Here, we showcase how smMDS, leveraging confocal detection to achieve nanocluster detection sensitivity, enables sizing of nanoscale assemblies of TAR DNA binding protein 43 (TDP-43) at sub-saturating concentrations directly in solution. Additionally, we illustrate smMDS’ efficacy in assessing the abundance and composition of these nanoclusters.

Fig. 8: Sizing of nanoscale clusters by smMDS.

a Schematic of TDP-43 phase separation and the formation of nanoscale clusters in the pre-phase separating regime. b Phase separation behavior of GFP-tagged TDP-43 as a function of KCl concentration as observed by widefield microscopy imaging. The phase diagram (left panel) was generated from measurements at five KCl concentrations and at 0.5 µM protein concentration. Representative images at 100 mM and 25 mM KCl are shown (right panels). ccrit denotes the critical KCl concentration. Experiments were repeated at least three times with similar results. Scale bar is 10 µm. c Continuous scan diffusion profiles for 0.5 µM GFP-tagged TDP-43 at 100 mM KCl. The upper panel shows the diffusion profile as obtained from a continuous scan measurement. Bright bursts indicate nanoclusters passing through the confocal detection volume. The bottom panel is a re-binned diffusion profile to extract the size of monomeric TDP-43. Diffusion profiles are shown as blue lines, experimental fits as orange lines, and error as green bands. The extracted RH [with errors] is given as an inset. 
d Exemplary fluorescence time trace (1-ms binning) from a step scan measurement at channel position 960 µm, as indicated in panel e. Nanoclusters were detected as bursts that exhibit a signal >5 standard deviations above the mean. Detection events are highlighted in red. e Total intensity of a segmented step scan across the chip (top panel) and histogram of detected nanocluster events as a function of chip position (bottom panel). Gaussian distributions were fit to each peak to extract a mean diffusion distance at each channel position. f Plot of mean diffusion distance versus time of travel within the channel. The inset graphically shows how diffusion distances were determined. The diffusion distance corresponds to half of the full width at half maximum (FWHM) of the Gaussian distributions at each measurement point. The width at timepoint zero was used for normalization. Data points (mean) are from three repeats; error bars indicate standard deviations. The orange line shows the fit according to Eq. 1. The extracted average RH of TDP-43 nanoclusters is given as an inset (mean ± standard deviation).

First, we mapped out a one-dimensional phase diagram of the protein TDP-43 (Fig. 8b) to assess the phase separation behavior of the protein with respect to changes in salt concentration. GFP-tagged TDP-43 at 0.5 µM protein concentration formed microscopically visible condensates (1–2 µm in diameter) below a critical salt concentration ccrit of 50 mM KCl. No condensates were visible by conventional fluorescence microscopy above that salt concentration and the solution appeared clear and well-mixed. Next, in order to assess whether TDP-43 forms nanoscale assemblies, we performed smMDS measurements at conditions where no microscopically visible condensates could be detected (i.e., well above ccrit). To this end, we first performed a continuous scan experiment at 100 mM KCl. The obtained profile, shown in Fig. 
8c (top panel), exhibited a broad spread signature, which is characteristic for bulk monomeric protein. Sizing of the profile (Fig. 8c, bottom panel) yielded a hydrodynamic radius of RH = 4.29 ± 0.8 nm, which is in agreement with the size of monomeric GFP-tagged TDP-43, as predicted by the HullRad model122. More importantly, in addition to the characteristic signature for monomeric protein in the continuous scan diffusion profile, we observed bright bursts that were narrowly distributed at the center of the channel on top of the diffusion profile, indicating the presence of clusters (Fig. 8c, top panel). To explore this further, we carried out smMDS step scan measurements. We performed high-resolution step scans with an interval of only 1 µm between steps within the central region of the diffusion profile where clusters would appear. Outside these regions we performed step scans with lower resolution (Fig. 8e, top panel). Clusters were clearly detectable as bursts above the mean bulk signal in the fluorescence time traces (Fig. 8d). These bursts were then counted to give us the number of clusters present at each position in the channel and binned in a histogram (Fig. 8e, bottom panel). Each peak in the histogram was then independently fit to a Gaussian distribution to obtain a mean diffusion distance that could be utilized to calculate D and, thus, RH. Specifically, we extracted half of the full width at half maximum (FWHM) of the Gaussians as the diffused distance at the four channel positions, corresponding to 0, 10, 26, and 55 s of travel within the channel. The diffusion distances x at each time point t (Fig. 8f) were then fitted with a one-dimensional solution to Fick’s second law

$$x \approx \sqrt{2Dt} \qquad (1)$$

to extract D and, thus, RH via the Stokes–Einstein relation. We performed this analysis with three independent measurements and obtained an average RH of 120 ± 10 nm for TDP-43 nanoclusters (Fig. 8f). We note that the sizes of clusters are similar to FUS clusters previously observed118, thus indicating that TDP-43 forms similar pre-phase separation clusters as FUS.

The simultaneous, yet independent measurement of the sizes of nanoclusters and monomeric protein further allows estimating the number of monomer units per nanocluster. This is done through comparison of the volume ratios of monomeric TDP-43 and the clustered form. Assuming no restructuring of the protein within the nanocluster, a single cluster could contain as many as 20,000 proteins if the cluster is composed of pure protein. However, as condensates are liquid in nature and contain solvent molecules, typical volume fractions of proteins within condensate systems are on the order of ~10–35%123,124,125; hence, we expect the number of proteins per cluster to be in the range of 2000–7000. In addition to size measurements, the ability to directly count clusters in a digital manner also enables the quantification of cluster particle concentrations and volume fractions. From three repeat measurements, we detected an average number of clusters of N = 2281 ± 929. Using a previously established conversion strategy60, this corresponds to a flux of F = 72,606 ± 29,570 clusters per second or a cluster particle concentration of c = 7.24 ± 2.9 pM, corresponding to a total nanocluster volume fraction of ϕ = 3.16 × 10⁻⁵. The concentration of TDP-43 nanoclusters detected here was therefore more than an order of magnitude higher than previously determined for FUS nanoclusters formed under the same protein and salt concentrations118, suggesting a difference in the intermolecular interactions that stabilize TDP-43 nanoclusters. 
Notably, TDP-43 is prone to aggregation and possesses a disordered domain capable of forming amyloid fibrils126. These characteristics may contribute to enhanced intermolecular interactions also in the clustered state, and potentially also explain the higher propensity of TDP-43 to form nanocluster assemblies.

Taken together, we have shown here that smMDS constitutes a solution-based biophysical analysis approach able to size pre-phase separation nanoclusters that are undetectable by conventional fluorescence microscopy. This places smMDS alongside other advanced microscopy techniques like super-resolution microscopy, which are invaluable for visualizing nanoclusters within cellular environments127,128,129, and mass photometry for label-free characterization of nanoclusters130. Moreover, the discovery of TDP-43 nanoclusters and the understanding of the nature of such sub-diffraction assemblies is critical, in particular, for progressing our understanding of macroscopic phase separation phenomena. Our single-molecule sizing approach therefore provides insight into a largely unexplored area of protein assembly, taking advantage of the capability to elucidate properties of low abundance nanoscale species present in a biomolecular condensate system.
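The chain of calculations used in this section (fitting x ≈ √(2Dt), converting D to RH via the Stokes–Einstein relation, estimating monomers per cluster from volume ratios, and converting a particle concentration into a volume fraction) can be retraced with a short Python sketch. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the through-origin least-squares fit is one simple way to fit Eq. 1, and the temperature (298.15 K) and viscosity of water (8.9 × 10⁻⁴ Pa s) are assumed values that are not given in the text.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro constant, 1/mol


def fit_diffusion_coefficient(distances_m, times_s):
    """Least-squares D from x^2 = 2*D*t (straight line through the origin)."""
    num = sum(t * x * x for x, t in zip(distances_m, times_s))
    den = 2.0 * sum(t * t for t in times_s)
    return num / den


def hydrodynamic_radius(d_m2_s, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Stokes-Einstein relation; temperature and viscosity are assumed values."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * d_m2_s)


def monomers_per_cluster(r_cluster, r_monomer, protein_volume_fraction=1.0):
    """Volume ratio of cluster to monomer, scaled by the protein volume fraction."""
    return protein_volume_fraction * (r_cluster / r_monomer) ** 3


def cluster_volume_fraction(conc_molar, r_cluster_m):
    """Total volume occupied by spherical clusters per unit solution volume."""
    particles_per_m3 = conc_molar * N_A * 1e3  # mol/L -> particles/m^3
    return particles_per_m3 * (4.0 / 3.0) * math.pi * r_cluster_m ** 3


# Radii and concentration reported in the text.
n_pure = monomers_per_cluster(120.0, 4.29)            # pure-protein upper bound
n_low = monomers_per_cluster(120.0, 4.29, 0.10)       # 10% protein by volume
n_high = monomers_per_cluster(120.0, 4.29, 0.35)      # 35% protein by volume
phi = cluster_volume_fraction(7.24e-12, 120e-9)
print(f"{n_pure:.0f} {n_low:.0f} {n_high:.0f} {phi:.2e}")
```

With the reported radii, the pure-protein estimate comes out at roughly 2 × 10⁴ monomers per cluster, and the volume fraction agrees with the reported ϕ once rounded to three significant figures, which suggests that the quoted values follow from a spherical-cluster assumption.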
