Active learning streamlines development of high performance catalysts for higher alcohol synthesis

Overview and scope of the active learning framework

We devised an active learning approach that integrates data-driven algorithms with experimental workflows. It learns continuously from existing and newly generated data across iterative experimental cycles to explore and identify FeCoCuZr compositions and reaction conditions that optimize the catalyst performance metric(s) of interest (Fig. 1)36,37,46. The core of the data-driven model combines Gaussian process (GP) and Bayesian optimization (BO) algorithms with human decision-making to accomplish single- or multi-objective tasks46,47.

Fig. 1: Scheme of active learning workflow to develop FeCoCuZr catalysts. The study is developed in three phases with progressive increases in the number of target performance metrics and variables (right). In each phase, iterative cycles comprising the depicted experimentation and modeling steps are performed (left).

To showcase the feasibility of this approach for higher alcohol synthesis (HAS), the study was conducted in three phases of progressively increasing model complexity. In Phase 1, the catalyst composition was varied with the objective of maximizing the space-time yield of higher alcohols (STYHA) at fixed reaction conditions. In Phase 2, the dimensionality of the problem was increased by concurrently exploring catalyst compositions and reaction conditions to maximize STYHA. In Phase 3, the approach was extended to multiple objectives by simultaneously maximizing STYHA and minimizing the combined selectivity to carbon dioxide and methane (\(S_{\mathrm{CO_2+CH_4}}\)).
Iterative cycles of six experiments each were conducted during each phase until the target performance metric(s) were achieved or reached saturation.

Phase 1: Optimal catalyst formulations for productivity

The first phase aimed to test whether the active learning framework could identify optimal FeCoCuZr formulations that maximize STYHA under fixed reaction conditions, specifically the H2:CO ratio (H2:CO), reaction temperature (T), pressure (P), and gas hourly space velocity (GHSV). This strategy enabled the exploration of a space containing >175,000 unique compositional possibilities, known as the chemical space34,36 (Supplementary Note 1), and helped to understand the sensitivity of higher alcohol productivity to catalyst composition in this family of materials. In the absence of prior composition and performance data for FeCoCuZr formulations, the initial model was trained on 31 data points for the FeCoZr, FeCuZr, and CuCoZr catalysts recently reported by our group45, denoted as seed experiments for Phase 1 (Supplementary Note 2, Supplementary Table 1). Reaction conditions were fixed at H2:CO = 2.0, T = 533 K, P = 50 bar, and GHSV = 24,000 cm3 h−1 gcat−1 across cycles to match those in the seed dataset (Supplementary Note 2).

In each cycle, the GP-BO model was trained using the molar contents of the four elements (Fe, Co, Cu, Zr) and the corresponding STYHA of all catalysts in the dataset. Subsequently, we evaluated the expected improvement (EI) and predictive variance (PV) acquisition functions separately under specific constraints to generate candidate compositions (see Methods section, "Gaussian process and Bayesian optimization", Supplementary Table 2).
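As a minimal illustration of the two acquisition functions named here, the sketch below scores candidates from a GP posterior mean and standard deviation using the standard closed-form EI for maximization and the predictive variance. It is not the implementation used in the study: the function names, the candidate labels, and all numerical values are hypothetical, and the compositional constraints mentioned in the text are omitted.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best_so_far):
    """Closed-form EI for a maximization objective (exploitation-leaning)."""
    if std <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    return (mean - best_so_far) * normal_cdf(z) + std * normal_pdf(z)


def predictive_variance(std):
    """PV acquisition: favors candidates the model is least sure about."""
    return std * std


def rank_candidates(candidates, best_so_far):
    """Return candidate labels ordered by EI and by PV, best first.

    candidates: list of (label, posterior_mean, posterior_std) tuples.
    """
    by_ei = sorted(
        candidates,
        key=lambda c: expected_improvement(c[1], c[2], best_so_far),
        reverse=True,
    )
    by_pv = sorted(candidates, key=lambda c: predictive_variance(c[2]), reverse=True)
    return [c[0] for c in by_ei], [c[0] for c in by_pv]


# Hypothetical posterior predictions (mean, std) for three candidate compositions.
candidates = [("A", 0.38, 0.02), ("B", 0.30, 0.10), ("C", 0.20, 0.15)]
ei_order, pv_order = rank_candidates(candidates, best_so_far=0.32)
print("EI ranking:", ei_order)
print("PV ranking:", pv_order)
```

In this toy case the two rankings are reversed: EI prefers the candidate predicted to beat the current best, whereas PV prefers the most uncertain one, which is why a selection step that draws from both lists is needed.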
Six suitable catalysts were manually selected for experimentation by balancing the number of recommendations from EI, which searches for compositions maximizing the STYHA objective (i.e., exploitation), and from PV, which seeks potential candidates in the unexplored chemical space (i.e., exploration)23,48. Here, it is important to acknowledge the key role of human decision-making in providing a judgement and selection from the suggested compositions, which allowed us to supervise and fine-tune the implementation of active learning at this early stage. The experimentally evaluated performance together with measured compositions of these six catalysts were added to the dataset to re-train the model for the next cycle.Five iterative cycles were performed (30 catalysts, see Supplementary Tables 3-7), which mapped the chemical space and identified regions with high STYHA (Fig. 2a). Progressive improvements in STYHA across cycles were achieved (Fig. 2b, Supplementary Tables 3-7), with the Fe69Co12Cu10Zr9 catalyst in Cycle 3 attaining the highest STYHA = 0.39 gHA h−1 gcat−1, a 1.2-fold improvement over the Fe79Co10Zr11 seed benchmark (STYHA = 0.32 gHA h−1 gcat−1). This performance was retained for at least 100 h on stream with no visible sign of deactivation (Supplementary Fig. 1). Similar compositions and performances obtained by the best catalysts in Cycles 4 and 5 confirmed the convergence of results.Fig. 2: Exploration of catalyst compositions to maximize STYHA at fixed reaction conditions.a Performance plot based on the chemical space of FeCoCuZr catalysts, with the sum of Fe, Co, and Cu molar contents normalized to 100% and excluding the Zr content. Compositions of catalysts evaluated during five active learning cycles are depicted as circles, and contours are generated from measured STYHA values of selected catalysts containing 10 ± 5 mol% Zr. 
b Evolution of the performance obtained from the best catalyst in each cycle and its associated composition (units in mol%). The catalysts are benchmarked vs. the reference Fe79Co10Zr11 catalyst from seed experiments. c Performance-based clustering of the catalysts. The catalysts with the highest STYHA in each cycle are colored in a–c. d Violin plots of catalyst compositions for the clusters identified in c, with individual data points shown as gray dots. Source data are provided in the source data file.To reveal compositional trends driving performance, we used the k-means clustering algorithm21,34. This allowed us to identify catalysts that had high STYHA and SHA – both being key performance metrics, and enabled informed decision making regarding experiment selection in subsequent phases. Four distinct clusters were observed (Fig. 2c, d). Catalysts in the Zr-rich and equimolar clusters exhibit low STYHA, likely due to low contents and suboptimal ratios of the active metals, respectively. Fe stands out as a key active metal, as maximum STYHA up to 0.39 gHA h−1 gcat−1 was attained by Fe-rich catalysts, while those in the Fe-Co rich cluster exhibit the highest SHA up to 17%. Notably, Zr content converged towards 10% in the highest performing Fe-rich catalysts, mimicking that obtained over the seed catalysts45. Irrespective of compositional differences, product distributions of best performing catalysts in each cycle largely resembled each other, with SHA = 14 ± 2%, \({S}_{{{{{{\rm{CO}}}}}}_{2}}\) = 13 ± 5%, \({S}_{{{{{{\rm{CH}}}}}}_{4}}\) = 25 ± 5%, and \(({S}_{{{{{{\rm{CH-}}}}}}}+{S}_{{{{{{\rm{CH=}}}}}}})\) = 48 ± 4% (Supplementary Fig. 2, Supplementary Table 8).The superior STYHA of the Fe69Co12Cu10Zr9 optimal catalyst compared to the Fe79Co10Zr11 seed benchmark stemmed from an increase in XCO (45% vs. 36%), since they showed similar SHA ∼ 13%. We thus conducted characterization studies to identify key features behind the activity gain. 
STEM-EDX maps of both catalysts (Fig. 3a, b, Supplementary Figs. 3, 4) identified small domains of ZrO2 in intimate contact with larger active metal nanoparticles. These defect-rich ZrO2 domains, as indicated by the presence of Zrδ+ contributions in the Zr 3d region of XPS spectra (Fig. 3c), were previously identified to enhance surface iron and cobalt carbide formation in line with the C 1s signal of XPS spectra (Fig. 3d)49. Carbides are known active phases in CO hydrogenation, and their interface with partially reduced iron oxide species is thought to enhance non-dissociative CO activation and higher alcohol formation in iron-rich catalysts45,49,50. Notably, elemental distributions of Fe and Co suggest that these two metals are inter-dispersed and not segregated in both catalysts whether as calcined or after use (Fig. 3a, b, Supplementary Figs. 3, 4), in line with their tendency to form intermetallic alloys or mixed oxides. Initially dispersed Cu in the fresh Fe69Co12Cu10Zr9 catalyst agglomerated into distinct 10–20 nm-sized agglomerates in contact with Fe-Co nanoparticles following reduction and reaction (Fig. 3b, Supplementary Figs. 3, 4), accompanied by a decrease in surface area (Supplementary Table 9) and confirmed by XRD patterns (Supplementary Fig. 5). Such architectural features have also been observed for similar FeCoCu materials prepared by a sol-gel method without Zr present in the composition51.Fig. 3: Characterization of the best performing catalyst in Phase 1 and the reference seed catalyst.STEM-EDX elemental maps after reaction of (a) the reference Fe79Co10Zr11 and (b) the best performing Fe69Co12Cu10Zr9 catalyst identified in Phase 1. Scale bars represent 50 nm. XPS spectra around (c) Zr 3d and (d) C 1s regions after reaction with deconvoluting signals indicated. e H2-TPR profiles with deconvoluted peaks corresponding to the reduction of oxides indicated. For iron oxide: Fe2O3 → Fe3O4 → FeO → Fe. 
Temperature-programmed H2-D2 exchange profiles of (f) Fe79Co10Zr11, (g) Fe69Co12Cu10Zr9 with the exchange temperature (at the maximum rate of HD formation) marked. Reaction conditions: H2:CO = 2.0, T = 533 K, P = 50 bar, and GHSV = 24,000 cm3 h−1 gcat−1. Source data are provided in the source data file.The presence of copper enhances the surface reducibility of Fe69Co12Cu10Zr9 as demonstrated by H2-TPR (Fig. 3e), where oxidic Cu is readily reduced in the presence of H2 to the metallic state, which in turn enhances hydrogen splitting and spillover to neighboring Fe-Co oxide domains51. Temperature-programmed H2-D2 exchange experiments also showed a 40 K decrease in the exchange temperature for Fe69Co12Cu10Zr9 (Fig. 3f, g) compared to Fe79Co10Zr11, suggesting that Cu nanoparticles improve hydrogen activation and thus play a similar role as in other Cu-catalyzed reactions such as methanol or olefin synthesis from COx51,52. As such, the increase in XCO, formation rates of each alcohol (Supplementary Table 10) and therefore STYHA attainable over iron-rich FeCoCuZr catalysts originates from its enhanced H2 activation ability from copper, while retaining the characteristics imparted by well-mixed, carbide-rich Fe-Co phases promoted by dispersed and defective ZrO2. Under reaction conditions, *CHx, CH3O*, and CH3CH2O* intermediates were detected by in situ DRIFTS (Supplementary Fig. 6), in line with the expected mechanisms of *CHx coupling and non-dissociative *CO insertion for HAS over m-FTS catalysts.It is worth noting that the model formulation did not include any explicit chemical information guiding the iterative cycle. Nevertheless, it was able to provide performance predictions with high accuracy. The overall performance was influenced by the balance between EI and PV functions used in the GP-BO algorithm23,24,48, as the latter exhibits higher uncertainty leading to lower accuracy, and vice versa for the former. 
A total of 13 and 17 catalysts were evaluated based on recommendations from the EI and PV acquisition functions, respectively (Supplementary Fig. 7, Supplementary Table 11), progressively improving model performance from an initial coefficient of determination R2 = 0.36 in Cycle 1 to R2 = 0.84 by the final cycle. This improvement resulted from the growing dataset generated during the active learning cycles and from an increased number of experimental evaluations guided by the exploitation function, which probes regions of high performance. Accurate predictions of catalyst performance were possible owing to the standardized synthesis method, which ensured consistency in structural properties.

Phase 2: Optimal catalyst formulations and reaction conditions for productivity

While basic reactivity patterns and the relevance of operating conditions are well known for HAS, there is no universally applicable set of optimal conditions, as the influence of each parameter is catalyst-specific. The second phase of this study addresses this by expanding the exploration space to include the reaction conditions H2:CO, T, and GHSV, defined as the parametric space (Supplementary Note 1). As Phase 1 did not include variation of reaction conditions, 20 additional experiments were performed to broaden the range of reaction conditions initially covered by the model, denoted as seed experiments for Phase 2 (Supplementary Note 3, Supplementary Table 12).
Phase 2 was initiated by training the GP-BO algorithm using these 50 data points under compositional and parametric constraints based on knowledge from previous experiments and the literature (Supplementary Note 4, Supplementary Table 13).Reaction conditions were found to exert a significant impact on STYHA, as Fe61Co20Cu9Zr10 identified as the best performer in Cycle 1 was already able to exceed 0.5 gHA h−1 gcat−1 at H2:CO = 1.8, T = 552 K, P = 50 bar, and GHSV = 32,550 cm3 h−1 gcat−1, almost 1.5-fold higher compared to the seed data used in Phase 1. Over the next two active learning cycles, STYHA reached ~0.7 gHA h−1 gcat−1, nearly doubling productivity compared to the maximum achieved in Phase 1 (Fig. 4a, Supplementary Tables 14–16). We noticed that by the end of Cycle 3, the optimizer was locally constrained at the GHSV upper bound of 50,000 cm3 h−1 gcat−1. As it is typically observed in literature that an increase in GHSV concurrently increases STYHA owing to higher reactant flows despite a slight reduction in XCO5,7,9, the upper bound was set to 100,000 cm3 h−1 gcat−1 in Cycle 4 to observe the behavior of the model. This adjustment led to the GP-BO extrapolating to previously unexplored GHSV values, suggesting catalytic systems that provided higher STYHA of up to 0.9 gHA h−1 gcat−1. By Cycle 5, our framework recommended the highly active Fe65Co19Cu5Zr11 catalyst that attained STYHA = 1.1 gHA h−1 gcat−1 at operating conditions of H2:CO = 2.2, T = 551 K, P = 50 bar, and GHSV = 90,000 cm3 h−1 gcat−1, marking a significant 3.5-fold increase from the original Phase 1 seed benchmark (Fig. 4a, Supplementary Table 18). The stability of this catalyst was evaluated in a 150-h catalytic run (Fig. 4b), where XCO ≥ 40% and STYHA ≥ 1 gHA h−1 gcat−1 were maintained throughout. 
Phase 2 concluded upon the completion of Cycle 6, during which catalytic systems yielding approximately 1 gHA h−1 gcat−1 were achieved once more, suggesting repeatability of results as well as model saturation (Fig. 4a, Supplementary Table 19).

Fig. 4: Identification of catalyst compositions and reaction conditions maximizing STYHA. a Evolution of the performance obtained from the best catalyst in each cycle and its respective composition and reaction conditions. All reactions were conducted at P = 50 bar. b Stability test of the most productive Fe65Co19Cu5Zr11 catalyst over 150 h on stream. c Comparison of the performance of the best FeCoCuZr catalytic systems developed in each active learning cycle across Phases 1 and 2 with over 120 Rh-, Mo-, and m-FTS-based catalysts reported in the literature. The horizontal dashed line represents the average literature-reported STYHA value. d Parity plots depicting the model performance through the active learning phases. Source data are provided in the source data file.

Benchmarking FeCoCuZr catalytic systems from Phases 1 and 2 against literature-reported catalysts across various families (Rh-based, Mo-based, and m-FTS-based) revealed notable differences. The literature-reported catalysts (125 in total) exhibited an average STYHA ≈ 0.1 gHA h−1 gcat−1, with top performers in the 90th percentile reaching 0.18 gHA h−1 gcat−1. In contrast, the best performing FeCoCuZr catalysts in the different cycles of Phases 1 and 2 had an average STYHA of approximately 0.6 gHA h−1 gcat−1, highlighting significantly enhanced productivity compared to literature-reported counterparts for direct HAS from syngas6,8 (Fig. 4c). By comparison, selectivity to higher alcohols was less responsive than productivity in this family.
Regardless of the diverse set of compositions and reaction conditions investigated, SHA = 11 ± 2% was the mean of the most active systems in Phase 2 (Supplementary Fig. 8, Supplementary Table 20), similar to the observation made in Phase 1 and pointing to an intrinsic feature of the FeCoCuZr family.The impact of varying reaction conditions on performance in Phase 2 could be visualized by mapping catalyst compositions and their respective STYHA attained in four clusters dictated by GHSV and T ranges (Supplementary Fig. 9). The highest-performing catalytic systems share similar compositions to the iron-rich catalysts discovered to be optimal in Phase 1 (Fig. 2c, d), hinting that they retain the same catalytic features previously determined to boost activity. The model recommendations of maximizing GHSV, T = 550–570 K, and moderate H2:CO ∼ 2 are also in line with established heuristics for syngas-based HAS (Supplementary Note 4). The accuracy remained high with R2 = 0.78 in Phase 2, comparable to that in Phase 1. Progressive improvements in accuracy were evident in Phase 2, with the mean absolute percentage error (MAPE) between predicted and measured STYHA for each cycle in Phase 2 decreasing from 33% in Cycle 1 to 7.6% in Cycle 6 (Supplementary Table 21). Considering predicted and measured STYHA across both phases resulted in an overall performance accuracy of R2 = 0.91 (Fig. 4d) with a low root mean squared error (RMSE) of 0.09 g h−1 gcat−1.Phase 3: Maximized productivity and minimized selectivity to by-productsThe third phase of this study aimed to apply active learning to search for catalytic systems that could meet multiple performance criteria, better reflecting the real-world demands on catalysts. Given the modest SHA across all catalysts developed in Phase 1 and 2, we focused on selectivities towards carbon dioxide \(({S}_{{{{{{{\rm{CO}}}}}}}_{2}})\) and methane \(({S}_{{{{{{{\rm{CH}}}}}}}_{4}})\), considered as the least valuable products in HAS5,9. 
The 86 data points in Phase 1 and 2 exhibited \({S}_{{{{{{{\rm{CO}}}}}}}_{2}}\)= 16 ± 6% and \({S}_{{{{{{{\rm{CH}}}}}}}_{4}}\) = 25 ± 4%, highlighting the significance of the water-gas shift (WGS) and CO methanation reactions, especially at conditions favoring high XCO and therefore STYHA. (Supplementary Note 4, Supplementary Fig. 10). A plot of STYHA vs. \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) suggested an intrinsic trade-off in the form of a Pareto front (Fig. 5a), in which the improvement of one metric would likely be at the expense of the other53,54.Fig. 5: Uncovering Pareto-optimal catalysts and performance drivers.a Visual depiction of Pareto-optimal catalytic performances before and after multi-objective active learning. The Pareto front delimits the zone available by catalysts developed in Phase 1 and 2. b Product distribution for FeCoCuZr catalysts across Phase 3. c Relative feature-importance of catalyst compositions and reaction conditions on each of the performance metrics as determined by SHAP analysis. Source data are provided in the source data file.This scenario was explored in Phase 3 by varying the catalyst compositions and reaction conditions simultaneously to maximize STYHA and minimize \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\). For this purpose, the GP-BO algorithm was trained with data from Phases 1 and 2 with STYHA and \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) as target metrics, using the Expected Hypervolume Improvement (EHVI) acquisition function, which guides the optimization process to recommend catalyst composition and reaction conditions that are likely to lead to better trade-offs among conflicting objectives23,35.During Cycle 1, a significant discrepancy between predicted and measured values of STYHA and \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) was observed (Supplementary Fig. 11, Supplementary Table 26) with none of the catalysts evaluated near the Pareto barrier (Fig. 
5a, Supplementary Table 22). Upon entering Cycle 2, two of the six catalysts evaluated were situated on the Pareto front (Fig. 5a, Supplementary Table 23), while the model exhibited enhanced prediction accuracy. The highest performing system notably attained STYHA = 1.04 gHA h−1 gcat−1, only 5% lower than the maximum attained in Phase 2 but with a drastically reduced \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) = 40% vs. 46%. By Cycle 3, model recommendations improved significantly as five of the six catalysts evaluated lie directly on the Pareto frontier without crossing it (Fig. 5a, Supplementary Table 24) attaining \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) = 34 ± 2% and STYHA = 0.65 ± 0.05 gHA h−1 gcat−1. Herein, while the productivity remained ca. two times higher than average literature values, we highlight the selectivity of undesired CO2 and CH4 was minimized by around 10% (Fig. 5b), in comparison to some of the catalysts developed in Phase 2, suggesting an optimal trade-off between STYHA and \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\). Notably, we identified optimal systems along the Pareto frontier, suggesting an intrinsic limitation of this family of HAS catalysts to achieve \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) < 30% without compromising STYHA. However, within this constraint, our strategy eventually uncovered five Pareto-optimal catalytic systems which are otherwise non-intuitive and not easily accessed by human experts55, thereby underscoring its versatility and significance.Performance drivers and data-informed guidelinesWe sought to elucidate the main performance drivers among the set of input features impacting STYHA and \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\). However, an inherent challenge with most ML algorithms, including the GP regressor used in this study, lies in the complexity of deciphering the internal rationale behind predictions—rendering them black-box in nature. 
To address this challenge and make the model interpretable, we utilized the agnostic ML explainer, SHapley Additive exPlanations (SHAP)56. This methodology facilitates the extraction of interpretable insights from the GP algorithm through the computation of feature-importance scores57,58. Akin to sensitivity analysis, SHAP determines the individual or combined contributions of features to the model’s prediction, enabling catalysis practitioners to quantify the relative importance of different features affecting performance, that can be corroborated with existing knowledge or lead to testable hypotheses.The overall influence of each feature was expressed by normalized SHAP values, revealing that reaction conditions and catalyst compositions contributed to ca. 60% and 40%, respectively, to the model predictions for both targeted metrics in Phases 2 and 3 (Fig. 5c). In the case of STYHA, GHSV and T emerged as the two most important parameters, in alignment with earlier intuition in Phase 2 as well as findings in the literature for similar C1 transformations36,57. Fe content ranked as the most prominent compositional input, in line with the high productivities only attained by iron-rich catalysts, as highlighted in the discussion of Phase 1 and given the claimed role of Fe phases in C-C coupling6. For \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\), T was identified as the most significant variable, followed by Fe and Co contents. The role of T in dictating selectivity patterns could be ascribed to the coexistence of competing reaction networks, each with different temperature-dependent kinetic and thermodynamic barriers including HAS, WGS, or methanation. This highlights the importance of optimizing T to fine-tune the selectivity towards higher alcohols or by-products. The Fe-Co-rich cluster determined in Phase 1 (Fig. 
2c) catered most favorably to HA selectivity, with Fe-Co surface carbides previously identified as a key feature for selective higher alcohol production from catalyst characterization45.Despite demonstrating the efficacy of active learning in uncovering catalytic systems that enhance multiple performance metrics for HAS, it is essential to acknowledge its scope and limitations in its current form. The lack of electronic or structural descriptors as inputs to the model and its inability to optimize performance metrics which are intrinsically unresponsive to screened variables, such as SHA in this work, can be mentioned (see Supplementary Note 5 for extended discussion). Nonetheless, in the course of this study, three categories of catalytic systems within the FeCoCuZr family emerged exhibiting distinct performance characteristics, namely high STYHA and \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) (STYHA = 0.97 ± 0.08 gHA h−1 gcat−1, \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) = 44 ± 2%, SHA = 10 ± 1%), low STYHA and \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) (STYHA = 0.25 ± 0.07 gHA h−1 gcat−1, \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) = 31 ± 3%, SHA = 14 ± 2%), and Pareto-optimal catalysts (STYHA = 0.63 ± 0.06 gHA h−1 gcat−1, \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) = 34 ± 2%, SHA = 14.6 ± 0.3%) (Fig. 6). Each category favors unique catalyst compositions and reaction conditions; for instance, systems displaying high STYHA and \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) are characterized by high molar Fe content, H2:CO, and GHSV values, whereas the low STYHA and \({S}_{{{{{{{\rm{CO}}}}}}}_{2}+{{{{{\rm{CH}}}}}}_{4}}\) counterparts are favored at equimolar Fe-Co contents, low H2:CO, and milder T. The Pareto-optimal catalysts feature a combination of the aforementioned traits, recommending high Fe contents and operation at high GHSV and mild T. 
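The Pareto-optimal category referred to above rests on a simple dominance test: one system dominates another if it is at least as good in both objectives (higher STYHA, lower combined CO2 and CH4 selectivity) and strictly better in at least one. The sketch below filters a set of points down to the non-dominated ones. It only illustrates this criterion and does not implement the EHVI acquisition function used in Phase 3; the function names and the data points are hypothetical and are not taken from the study's dataset.

```python
def dominates(a, b):
    """True if point a dominates point b.

    Each point is (sty, s_byproducts): sty is to be maximized,
    s_byproducts (combined CO2 + CH4 selectivity, %) is to be minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (STY, by-product selectivity in %) pairs for illustration only.
systems = [
    (1.10, 46.0),
    (1.04, 40.0),
    (0.65, 34.0),
    (0.60, 38.0),  # dominated by (0.65, 34.0)
    (0.25, 31.0),
    (0.90, 45.0),  # dominated by (1.04, 40.0)
]
front = pareto_front(systems)
print(front)
```

The pairwise check scales quadratically with the number of points, which is unproblematic for datasets of roughly a hundred experiments.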
These quantitative guidelines, especially those relating to operating conditions, align with literature findings and are likely not dependent on specific catalyst formulations and could be relevant to HAS catalysts in general6,9. However, the exact compositional guidelines provided herein apply to the FeCoCuZr catalysts investigated in this study and would arguably not be directly relevant in designing HAS systems with different active metals, promoters, and architectures prepared by different synthesis methods. Importantly, this methodology based on data analysis can be extended to other potential HAS systems or even other multi-product chemical transformations, provided sufficient experimental data is available. Other users are thus recommended to formulate specific guidelines for their application during the active learning process. Overall, in the absence of quantifiable techno-economic data and community consensus on practically relevant productivities or selectivities for HAS, this approach provides guidelines for optimizing key metrics, serving as valuable assets to catalysis practitioners and industry stakeholders to accelerate research efforts by assisting in the selection of appropriate catalytic systems and experiments, ultimately saving time and resources.Fig. 6: Establishment of guidelines for developing performance-specific catalysts.Data-informed guidelines aimed towards the development of FeCoCuZr catalytic systems belonging to three performance categories. The box plots present a statistical summary of the composition and reaction condition requirements for each category, showing the minimum, interquartile range, and maximum values. Outliers are indicated as points outside the boundary of the whiskers wherever applicable. The corresponding performance metrics are shown below as horizontal bars. 
Source data are provided in the source data file.

Active learning and sustainable laboratories

While the possible chemical and parametric space of the FeCoCuZr systems is on the order of a billion combinations, practical studies on multicomponent catalysts typically involve hundreds to thousands of screening experiments35,36,38. By employing active learning, we mapped the vast space of FeCoCuZr catalysts with a cumulative 104 experiments across Phases 1–3 to meet the desired performance objectives, consistent with the growing body of literature claiming that active learning accelerates experimental efforts34,35,36. This has a profound impact on the environmental and economic sustainability of catalyst development programs that has not been explored.

In this context, taking this study as representative of a catalyst development endeavor, we assessed the degree to which active learning could impact both sustainability pillars in laboratories (see scope in Supplementary Note 6). Our analysis suggests average reductions exceeding 90% in carbon footprint and costs when benchmarked against traditional campaigns (Fig. 7, Supplementary Tables 27–29). We also observe only a weak dependence of this result on regional variations across the globe that affect, for example, the composition of the energy mix or laboratory operational expenditure (Supplementary Fig. 12). Thus, by reducing the consumption of chemicals and energy and by optimizing resource utilization, active learning fosters sustainable catalysis laboratories.

Fig. 7: Impact of active learning on laboratory sustainability. Schematic illustration of the reduction in work-days, CO2 footprint, and operating costs on a global average basis fostered by the adoption of the active learning framework over traditional catalyst development programs, calculated from results obtained in this study.
The spirals represent experimental efforts, and the derivation of the sustainability metrics shown are detailed in Supplementary Note 6 and regional variations in Supplementary Fig. 12.
