Gas permeability, diffusivity, and solubility in polymers: Simulation-experiment data fusion and multi-task machine learning

Experimental data acquisitionMeasured gas transport properties (permeability, diffusivity, and solubility) for six different gases (CO2, CH4, O2, N2, H2, and He) were obtained from 84 publications listed in the Polymer Handbook32. The experimental testing temperatures ranged from 25 °C to 35 °C, and testing pressures varied between 1 and 30 atm. The dataset comprised a total of 820 polymers and included 3748, 709, and 550 Pexpt, Dexpt, and Sexpt values, respectively, amounting to a total of 5007 data points. Factors such as polymer process history and testing method were not directly included as parameters. Instead, the measured Pexpt, Dexpt, and Sexpt values are treated as samples from the distribution of possible values for a given polymer. As such, it is important to consider the uncertainty in the predictions, and not just the mean value of the prediction.Molecular Dynamics and Monte Carlo simulationsGas diffusivity and solubility data were generated using classical molecular dynamics (MD) and Monte Carlo (MC) simulations, respectively. These simulations were conducted using the open-source large atomic molecular massively parallel simulator (LAMMPS) package33. The atomic potential parameters for polymers were adopted from the general AMBER force field 2 (GAFF2)34. In the simulations, the gas molecules (i.e., CO2, CH4, O2, and N2) were treated as rigid molecules, and thus were modeled with non-bonded potentials described by the TraPPE (transferable potentials for phase equilibria) models35. To perform the simulations, 27 polymer chains were inserted into the simulation box, with each chain comprising of approximately 150 atoms, and their ends were capped with a methyl group. The initial polymer configurations were generated using the Polymer Structure Predictor (PSP) package36, and a representative snapshot is shown in Fig. 1b. To achieve equilibrated structures, all systems underwent a 21-step relaxation procedure as recommended by Abbott et al.37. The mean-squared displacement (MSD) of the polymers was then computed, and movement beyond a few times the distance of the radius of gyration on average was assessed. This step ensured that the polymers explored various conformations and reached an equilibrium conformational state and density. Once the equilibrated structures were obtained, Dsim and Ssim were calculated. The simulation protocol is outlined in Fig. 1b.For the Dsim calculations, a total of 27 gas molecules were randomly added to the simulation box. This specific number of molecules was chosen to be small enough to maintain the system in the dilute Fickian regime such that the gas molecules do not significantly influence each other, and yet large enough to obtain meaningful statistics. Subsequently, all systems underwent an additional equilibration of 10 ns in the NPT ensemble, followed by a 100–200 ns production run in the NVT ensemble. The choice of a 100–200 ns production run duration was made to ensure the convergence of gas diffusivity and gas MSD slope across a broad spectrum of polymer types. While shorter time frames are adequate for certain instances, there are cases where the extended range of 100–200 ns is necessary to achieve the desired level of convergence. To illustrate this behavior, we present an analysis of simulation time versus methane diffusivity for polyethylene, polyimide, polystyrene, and polymethyl methacrylate, with the results detailed in Supplementary Figure S1. The box size in the NVT run was fixed using the average spacing and density obtained from the last 1 ns of the NPT run. Nosé-Hoover thermostat and barostat were employed with a damping parameter of 100 time steps for each, and a time step of 1 fs was used in all MD simulations. The barostat coupled the three dimensions of the box to maintain a cubic box for all systems. Simulation outputs were saved every 1000 fs and block averaging from one polymer configuration was used to calculate an average Dsim and standard deviation from the gas MSD. Block averaging allows for the reduction of random noise and more reliable statistical measures38.For the Ssim calculations, a 5 ns production run was performed on equilibrated structures in an NVT ensemble. During this 5 ns run, a snapshot of the structure was captured every 100 ps, resulting in a total of 50 snapshots. Employing an ensemble of snapshots allows for improved sampling and a standard error, which is crucial for accurate estimation of Ssim39. Using a built-in LAMMPS function (https://docs.lammps.org/fix_widom.html), 25,000 gas particles were inserted per snapshot, at random positions, following the Widom insertion method40. This method involves determining the excess chemical potential resulting from the insertion of gas molecules into the polymer, which allows for the estimation of Henry’s constant. Henry’s constant indicates how easily a particular gas dissolves in the polymer. Henry’s Law is then used to obtain gas solubility from Henry’s constant, with an assumption of a partial pressure equal to 1 atm, which is the IUPAC standard testing condition41. This derivation is detailed in the methods section. No relaxation was performed to adjust the positions of the polymer atoms or the gas particles during the insertion process. Langevin thermostat was used with a time step of 1 fs for all MC simulations. 25 polymer configurations were used to calculate the Ssim, standard deviation, and the standard error from the excess chemical potential.Figure 1b provides an overview of the simulation protocol used, and details of Dsim estimation from gas MSD and Ssim from the excess chemical potential are described in the Methods section.Validation of MD and MC simulationsAs an essential step of this investigation, we aimed to validate and calibrate the accuracy of the MD and MC predictions and assess the extent to which the simulations capture trends in gas transport properties. Performing classical simulations with a specific force field for polymer-gas systems across extensive chemical spaces to estimate gas diffusivity and solubility is a relatively rare endeavor. While generic force fields like GAFF2 are designed for a wide variety of materials, they often require fine-tuning of potential parameters for each unique material to attain better accuracy.A total of 584 polymer-gas systems were simulated, out of which 342 systems had corresponding experimental measurements. The additional simulated systems were intended to expand the chemical coverage of the model. A comparison of Psim, Dsim, and Ssim against their respective experimental values, Pexpt, Dexpt, and Sexpt, is illustrated in Fig. 2. Overall, the simulations tend to overestimate the measured values, but they effectively capture the general trends across the polymer-gas chemical space considered. The overestimation of Dsim values, especially in low diffusivity regimes, can be attributed to the difficulty of classical force fields to accurately capture rare events and handle large chemical spaces42,43. More specifically, the simulated polymer systems often exhibit lower densities compared to experimental systems, as modeled systems are approximations of the real polymeric materials and may include lower molecular weights and limited equilibration times. In our methodology, we employ a 21-step polymer equilibration relaxation procedure, which results in consistent density trends compared to experimental systems. However, a slight underestimation of density remains, as also observed by Abbott et al. in their study employing the same procedure37. This increased free volume allows gas molecules to move more easily and quickly through the polymer system, resulting in higher diffusivity.Fig. 2: Validation of gas transport simulations.a Gas permeability parity plot, b Gas diffusivity parity plot, and c Gas solubility parity plot. Parity plots comparing the results from simulations against experiment data. Simulated gas permeability was derived using Eq. 1, using simulated gas diffusivity and solubility as inputs. The red lines represent trends in predicted data, while the black lines depict the parity lines of optimal fit. The error bars for all plots are represented in standard deviations. Error propagation techniques were employed to calculate the error bars for gas permeability. While some overestimation is expected across all cases, a qualitative correlation is demonstrated.Similarly, the discrepancies of Ssim relative to Sexpt may be due to the approximations inherent to the Widom insertion approach and the quality of the classical force fields across chemical spaces. Nonetheless, the favorable trends that the force fields can capture provide optimism for the usage of such simulation-derived datasets, albeit with lower fidelity, in multi-task learning frameworks. Another essential aspect of the validation is the derivation of Psim, from the product of Dsim and Ssim using Eq. 1. While non-equilibrium MD can be used to simulate Psim, it requires a more complex setup and can be computationally intensive.Multi-task model benchmarkTo elucidate the effect of data fusion, we train and compare both ST and MT polyGNN models, using a subset of the experimental data collected and simulation data generated. These models were evaluated based on the predictive accuracy of Pexpt, using various holdout train and test splits of 293 systems (comprised of 80 unique polymers with varying available gas data). For instance, in a 20/80 split, 20% of the Pexpt data is set aside as testing data, while 80% is used to train the model. To ensure representative data sampling, stratified sampling based on polymer SMILES was used when splitting the data into train and test sets. In this type of sampling, when a polymer is selected for the test set, all gas data for Pexpt associated with that polymer are withheld from the training set. This also provides insight into how well the model extrapolates to new unknown polymers. The polyGNN model training parameters used are detailed in Supplementary Table S1.In Fig. 3, we illustrate the two model types, ST and MT, along with the details of the train and test splits. The performance of the models was evaluated using two key metrics: the coefficient of determination (R2) and the order of magnitude error (OME)–units in Barrer. R2 assesses how well a model predicts an outcome, while OME quantifies the prediction error in terms of orders of magnitude (taken as the logarithm of the mean absolute error). We conducted four random seed selections of the training and test sets to compute the statistics of the model performance.Fig. 3: Comparison of single task and multi-task learning models.a Model inputs. This schema illustrates the train and test splits for two model variants: Regular Single Task (ST) and data-fused Multi-Task (MT) models. In the ST models, solely experimental gas permeability data is incorporated. Conversely, the MT model encompasses a possible amalgamation of experimental gas diffusivity and solubility, along with simulated gas permeability, diffusivity, and solubility data. b Benchmark models. Four distinct models were developed to assess the impact of MT learning. The first model (ST) exclusively incorporated experimental gas permeability data. In contrast, the subsequent MT models progressively integrated additional data. The presence of each data type in the model is indicated by a black checkmark. Here, P, D, and S represent gas permeability, diffusivity, and solubility, respectively. The abbreviation expt corresponds to experimental data, while sim signifies simulation data.Our MT learning methodology comprises two primary components: the integration of simulation data and the inclusion of correlated experimental data. To establish a baseline for comparison, we employ a ST model. Shown in Fig. 3a and represented by the “ST” row in Fig. 3b, the ST model is exclusively trained using Pexpt data. Due to its reliance on limited data and the absence of diverse property inputs, the ST model’s coverage of the chemical space is inherently constrained. As the test set percentage increases, this model is trained on progressively reduced amounts of data, leading to an anticipated decrease in predictive performance. This trend is evident in Fig. 4 where the R2 decreases and the OME increases as the ST model is trained on diminishing data portions. In the most challenging scenario (80% test set size), the R2 dropped to less than 0.50 and the OME increased to ≈0.44 Barrer.Fig. 4: Predictions of Permeability at various train test splits.a Coefficient of determination (R2). b Order of magnitude error (OME). R2 evaluates the predictive performance of a model, whereas OME measures the prediction error by considering orders of magnitude, represented as the logarithm of the mean absolute error. The ST and MT models are compared based on varying percentages of the unseen test set. The different test set sizes illustrate the impact of reducing training data. At 80%, the model is trained on only 20% of the dataset and tested on the remaining 80%, reflecting a data-scarce region with limited chemical coverage. Comparatively, the MT models show significant improvement over the ST model, particularly at higher percentages of the unseen test set.Now let’s consider the first element of our MT learning approach, specifically the augmentation of Pexpt training data with Psim, represented by the “MT-1” row in Fig. 3. The MT-1 model is enriched with simulation data spanning the test set space. Its primary purpose is to exploit the correlations between measured and simulated data learned from the training set. This scenario mirrors situations where experimental data is unavailable, and simulation data is introduced to guide the model’s predictions. Upon examining the MT-1 model, its performance noticeably surpasses that of the baseline ST model. The MT-1 model achieves an average R2 and OME of ≈0.77 and ≈0.30, respectively, as shown in Fig. 4 (MT-1). This improvement is particularly pronounced when the test set size reaches 80%, where the coverage of experimental data within the chemical space is most limited. This accentuates the ability of data fusion models, reinforced with simulation data, to effectively mitigate the challenges of extrapolation that conventional models (trained solely on a single experimental property) would inevitably confront. Furthermore, as another demonstration, this analysis was extended to experimental and simulation data for gas diffusivity, resulting in a similar strengthening in performance, as illustrated in Supplementary Figure S2. This observation underlines the value of bolstering experimental data with simulation data, indicating its potential extension to other properties of interest as well.Moving on to the second component of our MT learning methodology, we focus on augmenting the Pexpt training data with Dexpt and Sexpt, represented by the “MT-2” row in Fig. 3b. The inclusion of this supplementary data serves the purpose of empowering the model to leverage knowledge from other available pertinent properties and established physics and make predictions for the Pexpt values. In this scenario, a remarkable enhancement is observed, leading to a significant boost in predictive performance. Specifically, the average R2 and OME is ≈0.93 and ≈0.12, respectively, as displayed in Fig. 4 (MT-2). Comparing the MT learning component in the previous passage with this second component reveals a notable difference in performance. While both approaches expand the coverage of the chemical space, MT-2 stands out due to the incorporation of high-fidelity experimental data. Unlike MT-1, where all augmented data comes from simulation, the new information in MT-2 originates from additional experimental sources, contributing to superior predictive capabilities. The MT-2 model can be likened to an ideal scenario where complementary or correlated high fidelity data is readily available. In scenarios where such ideal conditions are not met, the MT-1 approach excels by effectively integrating simulation data to achieve a respectable level of prediction accuracy.In our final model, we combine the strategies embedded in both the MT-1 and MT-2 models, creating a unified model represented by row “MT-3” in Fig. 3b. This comprehensive model encompasses all available experimental and simulation data points. The performance of the MT-3 model slightly outperforms that of the MT-2 model, exhibiting an elevated average R2 of ≈0.96 and a comparable average OME of ≈0.10, as depicted in Fig. 4 (MT-3). Overall, this model achieves superior performance compared to the base ST model, which had an average R2 and OME is ≈0.57 and ≈0.38, respectively. These results establish the efficacy of integrating simulation and correlated experimental data in successfully addressing the challenges posed by ML extrapolation.Production model benchmarkIn the first iteration of our gas permeability prediction work, deployed at Polymer Genome (https://www.polymergenome.org), a Gaussian process regression algorithm was employed alongside a hierarchical polymer fingerprinting scheme to train a ST model11. In the present work, a transition is made to polyGNN (a recently published Graph Neural Network model that automatically generates fingerprints from SMILES strings), data augmentation, and invariant transformations to train a MT model. The models presented in the preceding section were trained using a subset of our dataset, a deliberate choice made to clearly illustrate the impact of incorporating diverse data types on prediction capabilities in a multi-task setting. Our final production model adopts the MT-3 model scheme and now incorporates all the available experimental and simulation data for gas permeability, diffusivity, and solubility. With this latest model iteration, our objective is to achieve substantial improvements over the previous version and to push the boundaries of transport predictions through polymers. The principal component analysis (PCA) plot in Fig. 5a, created using Polymer Genome fingerprints, displays the chemical space of the present study against 13,000 known polymers in our database. This plot visually demonstrates our production model’s expansion to include additional chemical compositions, increasing the range of polymers for which the model can make accurate predictions. We also present a PCA plot in Supplementary Figure S3, illustrating the chemical coverage of our simulation data in comparison to experimental gas transport data and 13,000 known polymers in our database. Figure 5b highlights the data fusion aspect of the model, showcasing the contrast between the datasets employed in the original and current models. Particularly noteworthy is the considerable enlargement of our dataset, expanding from 315 to 1052 polymers, accompanied by a significant increase in the total number of data points. With this amplified dataset, our model gains the capability to not only predict gas permeability but also include gas diffusivity and solubility. This broader scope of predictions reflects the power of our MT learning approach and its ability to leverage diverse data sources for a more comprehensive understanding of gas transport properties through polymers.Fig. 5: Comparison of chemical space and data coverage by the original and production models.a Principal Component Analysis (PCA) plot. The PCA plot demonstrates an expanded coverage of chemical space by both the original and production models. The orange and blue dots correspond to the coverage of the original and production model, respectively, while the grey dots represent the 13,000 known polymers in our database. b Dataset comparison. A comparison between the original and production models reveals an incorporation of diverse data types. The production model integrates experimental and simulation data for permeability, diffusivity, and solubility properties.To highlight the superior performance of the present model, Pexpt predictions were made on a holdout test set of 153 systems, consisting of 31 polymers across 13 polymer classes, following a similar approach as the original model11. The summarized outcomes are presented in Table 1, and the specific polymers selected for this assessment are listed in Supplementary Table S2. The overall R2 has increased from 0.93 to 0.96 in the updated model compared to the original. Upon a more detailed examination of individual polymer classes, it becomes evident that the R2 metric exceeds 90% for all classes for the production model, with a particularly significant enhancement observed for polyphosphazenes, where the R2 value has risen from 0.49 to 0.86. Additionally, substantial advancements have been achieved in prediction accuracy for polymers such as polynorbornenes, polypropynes, substituted polyacetylenes, and polypentynes. The diminished performance of the original model in these cases could be attributed to either limited data availability for certain polymer classes or inherent uncertainties within the experimental data. Importantly, it should be noted that the test data points for these specific polymer classes vary widely, ranging from 4 to 53 data points. This variability in data availability across diverse classes could potentially contribute to lower individual R2 values for specific classes while concurrently contributing to a higher overall model R2. Nonetheless, the updated model effectively overcomes these performance variations, highlighting its robustness and versatility. Further insights into the model’s performance are depicted in parity plots showcasing train and test set predictions for the 31 evaluated polymers, shown in Supplementary Figures S4 and S5.Table 1 Benchmarking model performancesForward-looking designThe ideal performance of gas separation membranes is related to two intrinsic material properties: the gas permeability and the permselectivity between specific target gas pairs. Ideally, a membrane would provide high permeability and permselectivity to maximize throughput and minimize costs. In 1991, Robeson44 documented a trade-off relationship between these two characteristics for polymers, often referred to as “the upper bound”. This principle asserts that polymers with high permeability typically exhibit diminished selectivity, and vice versa. These upper bounds illustrate the trade-off relationship for pairs of common gases (CO2, CH4, O2, N2, H2, and He), highlighting the best possible combination of permeability and permselectivity. This upper bound establishes a comparative benchmark for evaluating the performance metrics when designing novel membranes. As such, data driven methods that establish a relationship between polymer structure and polymer membrane performance hold immense potential in accelerating the design of tailor-made polymers for specific separation tasks.To this end, we demonstrate our model’s capability to make these assessments. We constructed permeability trade-off plots for ≈13,000 known polymers (i.e., previously synthesized) for the gas pairs; CO2/CH4, CO2/N2, H2/CH4, H2/CO2, O2/N2, and N2/CH4. Figure 6a shows the permeability trade-off plot for CO2/CH4, while the other gas pairs are shown in Supplementary Figure S6. The ML predicted gas pair permeability and selectivity closely align with the available experimental data and the bounds, while simulation data over-predicts as expected. Both experimental and simulation data are also shown in Fig. 6a. By predicting property values for the ≈13,000 known polymers, we can gain a clearer understanding of the overall trade-off behaviors. Robeson’s upper bound, initially established in 1991, is presented alongside updated bounds introduced in 2008 and 201945,46. PIM-DM-BTrip, a polymer with superior performance, is highlighted as a part of the set of polymers that helped define the 2019 bound.Fig. 6: Gas transport trade-off plots.a Gas permeability, b Gas diffusivity, and c Gas solubility. Using our model to predict ≈13,000 known polymers, we compare the results to experimental data. The orange and blue dots correspond to the experimental data and machine learning predictions, respectively, while the grey dots represent simulated values. The original Robeson upper bound (1991) is depicted as the dashed black line, with reevaluated bounds from 2008 and 2019 shown as red and yellow dashed lines, respectively, for gas permeability. There are no established bounds for gas diffusivity or solubility, but the model predictions closely align with the experimental data values. In the case of CO2/CH4 diffusivity selectivity, the low diffusivity regime has high prediction uncertainty and should be taken with caution.Research endeavors commonly focus on permeability trade-off plots, however as permeability can be broken down into diffusivity and solubility components, we also created CO2/CH4 trade-off plots for these properties, as shown in Fig. 6b, c. Diffusivity and solubility trade-off plots for CO2/N2, O2/N2, and N2/CH4 are illustrated in Supplementary Fig. S7 and S8. When using these models, the sensibility of predictions can be evaluated by observing common trends for the properties. For example, gas diffusivity tends to follow the relationship of DO2 > DCO2 > DN2 > DCH4, a pattern primarily driven by the molecular diameter effects. However, Fig. 6b, illustrates instances where the CO2/CH4 diffusivity selectivity falls below 1 (i.e., below 0 in the log scale). This contradicts the intuition that CO2 diffusivity should almost always be greater than CH4. Although there are cases where DCO2/DCH4 < 1, it is a rare occurrence. A closer examination of these suspicious predictions reveals that most of them fall in the lower diffusivity regime. In this regime, prediction uncertainty, calculated using Monte Carlo dropout, tends to be inflated, revealing lower confidence for predictions. This heightened uncertainty can be directly attributed to the scarcity of data in this specific range, a challenge that is particularly pronounced in both simulations and experimental measurements. Indeed, as can be seen from Fig. 6b, there are no measured or simulation data points in this property range, and hence, the ML predictions must be viewed with extreme caution and suspicion. This underscores the importance of recognizing that these models are valuable tools, but they must be used in conjunction with chemical intuition and an understanding of prediction uncertainties, especially for predictions in regions far away from the chemical space of the training set. This becomes especially critical when assessing areas with limited data or when venturing into new domains. These considerations thus mandate either experiments or simulations in such unexplored chemical spaces to better inform the ML models.Trade-off plots are typically employed in designing amorphous polymers for gas separation, but when considering other applications such as packaging, the degree of crystallinity of the polymer must be considered. Gas transport behavior in semi-crystalline polymers varies due to the crystalline regions acting as impermeable barriers against gas penetration47. Michaels et al. originally described this behavior using a two-phase model, which comprises a crystalline phase and an amorphous phase, where impedance is directly proportional to crystallinity48. Weinkauf et al. extended this model into a three-phase model that incorporates the ratio between the rigid amorphous phase fraction and mobile amorphous phase fraction (RAF/MAF)47. These models provide critical insights into the behavior of semi-crystalline polymers and offer guidelines for tailoring their gas transport properties to specific applications. The present work may be extended to address such practical situations by planning simulations of gas diffusivity and solubility through amorphous, crystalline, and amorphous-crystalline interfaces.

Hot Topics

Related Articles