Preventive machine learning models incorporating health checkup data and hair mineral analysis for low bone mass identification

Study design and participants

This was a single-center, retrospective cross-sectional study of community-dwelling postmenopausal women and men aged 50 years and older who underwent health checkups between 2008 and 2022 at the health promotion center of Bundang CHA Medical Center, CHA University, South Korea. Participants with missing values in their blood samples and those who had not completed the questionnaire were excluded. A total of 2061 participants were enrolled; after the exclusion criteria were applied, 2026 subjects were included in the analysis. Hair mineral analysis is an option within our medical center’s health screening program that enables the general public to assess their mineral levels and exposure to heavy metals. Figure 1 shows the flowchart of participant selection. This study was approved by the Institutional Review Board (IRB protocol no. 2023-08-038), and informed consent was waived because of the retrospective nature of the study.

Figure 1 Study participants and the machine learning models. BMD, bone mineral density.

Biochemical and anthropometric measurements

A baseline physical examination was conducted, which included measurement of vital signs, body height, weight, and waist circumference. Blood samples were taken after an overnight fast of 8 h. Blood tests included complete blood count, biochemical tests, metabolic components, endocrine hormones, tumor markers, and bone turnover markers, all of which are frequently performed during health checkups. Detailed features are presented in Table 1. BMD was measured using DXA (Hologic QDR-4500, Bedford, MA, USA) at the lumbar spine (L1-4), unilateral femoral neck, and total femur. The T-score represents the number of standard deviations by which BMD differs from the mean of healthy young adults of the same sex and ethnicity. T-scores were obtained at each site, and the lowest T-score was used to interpret the results.
Osteoporosis was defined as T-score ≤ −2.5, osteopenia as −2.5 < T-score < −1, and normal as T-score ≥ −1 [20]. The osteopenia and osteoporosis groups were combined into the low bone mass (LBM) group. The normal and LBM groups were used to train the binary classification model.
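The thresholds above can be expressed as a short sketch. The function name `classify_bmd` and the example T-scores are hypothetical and chosen only for illustration; the binary label follows the convention of 1 for LBM and 0 for normal.

```python
def classify_bmd(t_scores):
    """Classify bone status from T-scores measured at several sites.

    The lowest T-score across sites drives the classification.
    Returns (category, lbm_label), where lbm_label is 1 for low bone
    mass (osteopenia or osteoporosis) and 0 for normal.
    """
    lowest = min(t_scores)
    if lowest <= -2.5:
        category = "osteoporosis"
    elif lowest < -1.0:
        category = "osteopenia"
    else:
        category = "normal"
    lbm_label = 0 if category == "normal" else 1
    return category, lbm_label


# Hypothetical T-scores: lumbar spine, femoral neck, total femur.
example = classify_bmd([-0.5, -1.2, -0.9])
```

In this hypothetical case the femoral neck value (−1.2) is the lowest of the three, so the participant would fall in the osteopenia category and receive the LBM label.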
Table 1 Input features used for machine learning (ML) algorithms.

Demographic and lifestyle parameters

All participants were asked to complete a questionnaire on the day of their health checkup. The questionnaire covered sociodemographic characteristics, past medical history, smoking status, alcohol consumption, and physical activity. Individuals consuming more than 14 standard drinks were categorized as the excessive alcohol consumption group [21]. Smoking status was classified as current smoker, ex-smoker, or non-smoker, regardless of duration. Regular physical activity was defined as moderate-intensity physical activity of at least 150–300 min/week [22]. Participants were classified as having hypertension, diabetes mellitus, or dyslipidemia if they were taking medication for the condition.

Hair mineral analysis

Hair samples were obtained with stainless steel sampling scissors from four different points of the occipital scalp. All participants were asked not to chemically treat their hair for at least 8 weeks before sample collection. Each hair sample was placed directly into a clean specimen envelope and sent to USA Trace Elements Inc. (TEI, Dallas, TX, USA). Hair mineral analysis measured the concentrations of nutritional, additional, and toxic elements. A total of 22 elements were used for the analysis (Supplementary Table S1). All mineral levels were reported in milligrams percent (mg%); 1 mg% equals 10 parts per million (ppm).

Machine learning model development and input features

Four ensemble ML algorithms were used to analyze the data: Random Forest (RF), Extreme Gradient Boosting (XGB), Gradient Boosting (GB), and Adaptive Boosting (AdaBoost) [23,24,25]. All models were implemented using Scikit-learn in Python 3.11 (Python Software Foundation, Wilmington, DE, USA).
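A minimal sketch of how three of the four ensemble classifiers might be set up with Scikit-learn is given here. The data are synthetic stand-ins generated with `make_classification`, the hyperparameter values and random seeds are illustrative assumptions rather than the tuned values, and XGB is omitted because it is distributed in the separate `xgboost` package rather than in Scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT study data): 200 rows, 10 numeric features,
# binary target where 1 plays the role of LBM and 0 of normal.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Random 80:20 split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Illustrative settings only; these are not the study's hyperparameters.
models = {
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "GB": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

# Fit each classifier and record its accuracy on the held-out 20%.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
```

All three estimators share the same `fit` and `score` interface, which is what makes it straightforward to compare several ensemble methods on one dataset.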
The ML models were trained to predict LBM, which comprised the osteopenia and osteoporosis groups. The prediction target of each model was a binary variable, where “1” represented the LBM group and “0” represented the normal group.

The health checkup results described above were divided into five categories: demographics, anthropometric measurements, lifestyle, medical history, and blood tests. The checkup results and hair mineral analysis were used as features in the ML algorithms. The dataset was randomly divided into training and test datasets at an 80:20 ratio: 1620 participants in the training dataset and 406 in the test dataset. No feature weights were applied to the models.

The four ML algorithms were trained and validated with 50 repetitions of fivefold cross-validation, and the area under the receiver operating characteristic curve (AUROC) was averaged across the repetitions. The receiver operating characteristic (ROC) curve plots the true positive rate (sensitivity) on the y axis against the false positive rate (1 − specificity) on the x axis. The AUROC is a summary measure that essentially averages diagnostic accuracy across the spectrum of test values; it equals 0.5 when the ROC curve corresponds to random chance and 1.0 for perfect accuracy [26]. Accuracy, sensitivity, specificity, precision (positive predictive value), negative predictive value, and F1 score were calculated from the confusion matrix and summarized as the performance metrics. A grid search was performed to optimize the hyperparameters (Supplementary Table S2). In the data preprocessing steps, participants with missing values were excluded (Fig. 1), categorical variables were converted to binary indicators by one-hot encoding, and numeric variables were standardized with StandardScaler in Python, a linear rescaling to zero mean and unit variance that leaves the shape of each variable’s distribution unchanged [27].

Shapley additive explanation (SHAP) values were used to compare the effects of the features and to identify the most important parameters.
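The performance metrics and the AUROC described above can be illustrated with a short, dependency-free sketch. The labels, scores, and the 0.5 decision threshold below are made-up values for illustration, and the helper names are hypothetical; the AUROC is computed here through its pairwise-ranking interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), which is one of several equivalent ways to obtain it.

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(y_true, y_pred):
    """Compute the six confusion-matrix metrics for binary labels (1 = LBM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "precision": precision,  # positive predictive value
        "npv": _ratio(tn, tn + fn),  # negative predictive value
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def auroc(y_true, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Made-up labels and predicted probabilities for eight participants.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = confusion_metrics(y_true, y_pred)
area = auroc(y_true, scores)
```

The confusion-matrix metrics depend on the chosen threshold, whereas the AUROC is computed from the scores alone and is therefore threshold-independent.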
SHAP values, rooted in the cooperative game theory of fair payoff allocation, are adapted to explain individual feature contributions in ML models [28,29]. They can be visualized using summary plots, which display the average magnitude and direction of each feature’s impact on predictions and provide a useful overview of the model.

Statistical analysis

Continuous variables were presented as means ± standard deviation and medians with interquartile range, and categorical variables were presented as numbers (percentages). Comparisons between the two groups were performed using the chi-square test or Fisher’s exact test for categorical variables and the independent t-test for continuous variables. Hair mineral concentrations were compared between the two groups using the Mann–Whitney U test. All statistical analyses were performed using the SPSS statistical package, version 27.0 (IBM Corporation, Armonk, NY, USA), and p-values < 0.05 were considered statistically significant.

Informed consent

Informed consent was waived because of the retrospective nature of the study and because the analysis used anonymous clinical data.

Institutional review board statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of CHA Bundang Medical Center (IRB protocol no. 2023-08-038).
