Identification of metabolic biomarkers in idiopathic pulmonary arterial hypertension using targeted metabolomics and bioinformatics analysis

Recent research has increasingly highlighted the role of metabolomics in understanding PAH. Metabolomics offers a direct view of the perturbations in small-molecule metabolites that result from or contribute to PAH pathophysiology. For instance, studies have identified alterations in lipid metabolism, glycolysis, and amino acid pathways in PAH patients, indicating a systemic shift in energy utilization and biosynthesis13. Another study highlighted the role of disrupted nitric oxide pathways and suggested potential metabolic signatures that could be linked to the severity and progression of PAH14. A combination of liquid and gas chromatography-based mass spectrometry showed that patients with severe PAH exhibited disrupted glycolysis, an increased tricarboxylic acid (TCA) cycle, and altered fatty acid metabolism with changes in oxidation pathways15. Nevertheless, a clinically useful predictor cannot practically be built on such a large number of features. Furthermore, because a single metabolite has limited predictive power and varies across cohorts16,17, our objective was to construct a robust predictor for PAH diagnosis from a small number of metabolites associated with diagnosis in our cohort.

The application of machine learning algorithms in metabolomics is becoming increasingly important in the interpretation of complex biological datasets. Machine learning can handle the large, high-dimensional datasets typical of metabolomic studies, enabling researchers to uncover patterns and associations that may be missed by traditional statistical methods18,19. In this study, the machine learning algorithms LASSO, support vector machine (SVM), and random forest (RF) were employed in a joint feature selection process for disease diagnostic factors. These approaches were selected for their complementary features.
In particular, LASSO is effective for feature selection when the number of predictors is much larger than the sample size20. LASSO can also be applied to categorical variables, thereby enhancing the predictive accuracy and interpretability of statistical models21. Random forest and SVM are robust machine learning techniques that can effectively process high-dimensional data and have previously been used in metabolomics22. In our study, random forest and SVM allowed effective differentiation between control and IPAH patients, highlighting key metabolites that serve as potential biomarkers. This approach not only improves the accuracy of biomarker identification but also enhances our understanding of the underlying molecular mechanisms of disease. The predictive modeling capabilities of machine learning can further assist in the clinical translation of metabolomic findings, providing a foundation for personalized approaches to treating IPAH.

The arginine pathway was the most prominent in our cohort, which is consistent with the findings of other relevant studies. Arginine serves as the substrate for the synthesis of nitric oxide (NO), a crucial mediator of vascular homeostasis and vasodilation23. Patients with PAH, as well as other forms of pulmonary hypertension, exhibit reduced arginine bioavailability compared with healthy controls. One study demonstrated that the arginine metabolic pathway is altered in pulmonary hypertension, with a strong inverse correlation between the ratio of arginine to ornithine plus citrulline and key pulmonary hemodynamic indicators, indicating significant changes in arginine bioavailability24. Altered arginine metabolism, particularly through increased arginase activity, significantly impacts nitric oxide synthesis in pulmonary arterial hypertension.
This highlights distinct metabolic endotypes among patients that could influence therapeutic approaches25. These studies corroborate our findings, in which significant alterations in the arginine synthesis pathway were observed, emphasizing the shift toward altered polyamine metabolism in IPAH.

In this study, a total of 20 differentially expressed metabolites (DEMs) were identified by metabolomics analysis. We used a VIP value greater than 1 combined with a raw P value less than 0.05 as the criteria for identifying differential metabolites. This approach is commonly applied in metabolomics studies to balance model contribution and statistical significance. The VIP (Variable Importance in Projection) value, derived from PLS-DA models, reflects the importance of each metabolite in distinguishing between sample groups. Using VIP values helps ensure that the selected metabolites contribute meaningfully to the model's ability to differentiate groups, while the raw P value tests whether the observed differences are statistically significant26. The use of VIP values combined with raw P values is a feasible and widely accepted method in metabolomics analysis27,28. It allows the inclusion of biologically relevant metabolites that may not meet the more stringent criteria of adjusted P values but still play an important role in metabolic pathways. By applying both criteria, we aim to provide a comprehensive view of the metabolic changes while maintaining statistical rigor.

These DEMs were mainly involved in several metabolic pathways, including arginine biosynthesis; histidine metabolism; arginine and proline metabolism; and glycine, serine and threonine metabolism. Machine learning identified five metabolites, namely AMP, kynurenine, homoserine, tryptophan, and spermine, that can significantly differentiate patients with pulmonary hypertension from healthy individuals.
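
The two-step screening described above (VIP > 1 with raw P < 0.05, followed by joint selection across several algorithms) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: it assumes that VIP scores and raw P values have already been computed elsewhere, it assumes that "joint" selection means keeping only the features chosen by every algorithm, and all names (`MetaboliteStat`, `screen_dems`, `consensus_features`, `met_A` and so on) and all numbers are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaboliteStat:
    name: str
    vip: float      # VIP score from a PLS-DA model (assumed precomputed)
    p_value: float  # raw P value from a univariate test (assumed precomputed)


def screen_dems(stats, vip_cutoff=1.0, p_cutoff=0.05):
    """Keep metabolites with VIP > vip_cutoff and raw P < p_cutoff."""
    return [s.name for s in stats if s.vip > vip_cutoff and s.p_value < p_cutoff]


def consensus_features(*selections):
    """Return the features chosen by every selector (set intersection), sorted."""
    if not selections:
        return []
    common = set(selections[0])
    for selected in selections[1:]:
        common &= set(selected)
    return sorted(common)


# Made-up placeholder values, not data from this study.
stats = [
    MetaboliteStat("met_A", vip=1.8, p_value=0.003),
    MetaboliteStat("met_B", vip=1.2, p_value=0.20),  # fails the P criterion
    MetaboliteStat("met_C", vip=0.7, p_value=0.01),  # fails the VIP criterion
    MetaboliteStat("met_D", vip=1.5, p_value=0.04),
    MetaboliteStat("met_E", vip=2.1, p_value=0.001),
]
dems = screen_dems(stats)

# Hypothetical outputs of three independent selectors run on the screened DEMs.
lasso_selected = ["met_A", "met_D", "met_E"]
svm_selected = ["met_A", "met_E"]
rf_selected = ["met_A", "met_D", "met_E"]
signature = consensus_features(lasso_selected, svm_selected, rf_selected)

print(dems)       # ['met_A', 'met_D', 'met_E']
print(signature)  # ['met_A', 'met_E']
```

Taking the intersection is the strictest way to combine selectors; a majority vote would retain more features at the cost of a less conservative signature.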
Adenosine monophosphate (AMP) acts as an intermediary in the energy metabolism of adenosine triphosphate (ATP) and is a critical element of the urea cycle. AMP-activated protein kinase (AMPK) is a highly conserved serine/threonine protein kinase that has a proapoptotic function in invasive smooth muscle cells (SMCs)29. Kynurenine is a direct metabolite of tryptophan, produced through the kynurenine pathway. This pathway is essential for the catabolism of tryptophan and is implicated in the production of several bioactive compounds, particularly those involved in the immune response30. Gregory et al. identified metabolic signatures of right ventricular-pulmonary vascular dysfunction, revealing that tryptophan metabolites, particularly those produced via the indoleamine 2,3-dioxygenase (IDO) pathway, are closely associated with pulmonary hypertension and could serve as novel biomarkers24. These results add to an accumulating body of clinical and preclinical evidence indicating a role for the kynurenine pathway of tryptophan metabolism in the pathogenesis of PAH. Homoserine is an important intermediate in living organisms. As part of the amino acid synthesis pathway, homoserine metabolites play an important role in the regulation of cell proliferation and differentiation31,32. Homoserine also plays a role in metabolic syndrome (MetS), a common health problem in which cardiovascular-metabolic risk factors are present. A study of patients with primary MetS and matched controls found significantly lower levels of homoserine in patients with MetS, as well as associations with markers of inflammation, blood glucose, blood pressure, and lipocalin33. Yang et al. concluded that elevated plasma spermine promotes pulmonary vascular remodeling in pulmonary arterial hypertension and that targeting spermine synthase may offer a novel therapeutic approach for the disease34.

Three hub genes were identified through machine learning, and their expression was further validated using an additional PAH dataset, demonstrating robust diagnostic value. MAPK6 is a member of the MAPK signaling pathway, which plays an important role in how cells respond to external stimuli. MAPK6 is implicated in a number of cellular processes, including proliferation, differentiation, invasion, and immune response35,36,37. A large body of scientific literature supports the involvement of kynurenine and tryptophan in inflammatory responses and immune regulation38. Both metabolites are key players in the kynurenine pathway, the primary route of tryptophan catabolism. This pathway is critical for immune regulation, particularly in the control of inflammation. Kynurenine and its derivatives have been shown to have immunosuppressive and anti-inflammatory properties that have implications for a variety of immune-related diseases39. Dysregulation of this pathway, especially under conditions of immune stress, can lead to pathological conditions such as cardiovascular disease, autoimmune syndromes, and neurodegenerative disorders40. MAPK6, in turn, is involved in pathways that regulate cellular responses to inflammation. The interaction between MAPK6 dysregulation and kynurenine or tryptophan metabolism may contribute to the inflammatory processes seen in diseases such as pulmonary arterial hypertension, where both immune regulation and cellular metabolism are important.

SLC7A11, also known as xCT, is a cystine/glutamate antiporter that plays a primary role in the cystine metabolic pathway41.
Glutathione, whose synthesis depends on the cystine imported through SLC7A11, is a crucial antioxidant that safeguards cells from oxidative stress by neutralizing reactive oxygen species (ROS) and upholding cellular redox homeostasis. SLC7A11 therefore plays a pivotal role in regulating cellular redox homeostasis and in resisting cell death, such as ferroptosis (iron-dependent cell death)42. The SLC7A11 gene is overexpressed in a multitude of human malignancies, and its expression correlates with tumor growth, proliferation, dissemination, the tumor microenvironment and resistance to treatment43,44. Consequently, it is also regarded as an important target for cancer therapy. The metabolite spermine, a polyamine, is known to be involved in the regulation of oxidative stress and polyamine metabolism45. Polyamine metabolism is essential for cell growth and survival under stress conditions. Altered polyamine metabolism, as seen with elevated spermine levels, may exacerbate the effects of SLC7A11 dysregulation.

CDC42BPA is a serine/threonine protein kinase that binds to the small GTPase CDC42 and is involved in the regulation of cytoskeletal reorganization, cell migration and shape change. CDC42BPA affects cell cycle-related proteins and signaling pathways and has been shown to promote cell proliferation46. Energy metabolism, particularly the use of ATP, is intimately linked to cytoskeletal dynamics, as cytoskeletal rearrangements are energy-intensive processes. Studies indicate that around 50% of cellular ATP is consumed in maintaining the cytoskeleton's structure and function, suggesting that changes in AMP and ATP levels can directly influence these dynamics47.

The present study employs a multifaceted approach to elucidate the expression landscape of metabolites and metabolism-related genes in PAH. This approach combines advanced analytical techniques, including targeted metabolomics differential analysis and multiple machine learning algorithms, to identify potential signature biomarkers.
The application of intelligent algorithms to mine variables becomes crucial given the substantial amount of metabolomic information available in PAH. In this study, five distinctive DEMs were identified in the serum of PAH patients using the LASSO, SVM, and RF algorithms. Furthermore, three metabolism-related genes were identified. The diagnostic efficacy of the five metabolites was then evaluated by ROC curve analysis, and they were identified as potential new markers for the diagnosis of PAH. The expression of the metabolism-related genes was validated in lung tissue or PBMCs from three human datasets. However, given the exploratory nature of this study and the limited number of subjects, we used completely healthy individuals as controls. This choice may result in an overestimation of the sensitivity and specificity of the metabolic biomarkers. In future large-sample studies, we will select case controls as the control group to verify the diagnostic efficacy of these biomarkers.
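
As an illustration of the ROC-based evaluation mentioned above, the following minimal Python sketch computes the area under the ROC curve for a single marker using its rank-based (Mann-Whitney) interpretation, together with sensitivity and specificity at one cutoff. It is a sketch under stated assumptions rather than the analysis performed in this study: the function names `roc_auc` and `sensitivity_specificity`, the cutoff, and all values are hypothetical, and the marker is assumed to be higher in patients than in controls.

```python
def roc_auc(case_values, control_values):
    """AUC as the probability that a random case scores above a random control.

    Ties count as one half (the Mann-Whitney U formulation of the AUC).
    """
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def sensitivity_specificity(case_values, control_values, threshold):
    """Classify a value >= threshold as positive and score both groups."""
    true_pos = sum(v >= threshold for v in case_values)
    true_neg = sum(v < threshold for v in control_values)
    return true_pos / len(case_values), true_neg / len(control_values)


# Made-up values for one hypothetical metabolite, not data from this study.
patients = [5.1, 6.3, 4.8, 7.0, 5.9]
controls = [4.2, 5.0, 3.9, 4.8, 4.5]

auc = roc_auc(patients, controls)
sens, spec = sensitivity_specificity(patients, controls, threshold=5.0)

print(round(auc, 2))  # 0.94
print(sens, spec)     # 0.8 0.8
```

For a marker that is lower in patients, the same function returns a value below 0.5, and 1 minus that value gives the corresponding discriminative AUC.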
