Computational algorithm based on health and lifestyle traits to categorize lifemetabotypes in the NUTRiMDEA cohort

Descriptive characteristics of NUTRiMDEA population

Participants’ phenotypic characteristics in the NUTRiMDEA study were categorized by sex (female and male) and age (< 40 years and ≥ 40 years); the data are shown in Tables 1, 2 and 3. The analysis of factorial interactions between sex and age showed significant differences in education, home situation, number of meals per day, snacking habits, nap habit, weight, BMI, and PCS12. A higher proportion of women and younger participants declared university education, while the younger population reported a higher rate of students and a lower rate of employment. Younger men were more likely to live alone, whereas older individuals reported living as couples and with children. Regarding cardiometabolic diseases, older men reported a higher prevalence. Regarding family history, women reported a higher prevalence of familial high blood pressure (HBP) and dyslipidemia. Younger women reported more depression, whereas older men reported a lower incidence. Regarding lifestyle, younger men reported a higher smoking rate, whereas younger women tended to have a lower smoking prevalence but a higher meal frequency and more snacking habits. Younger individuals, regardless of sex, reported replacing food with snacks more often and consumed more water than older participants. Participants under 40 years also reported longer naps (> 60 min) and more hours of sleep at night than those aged 40 years or older. The data revealed significant differences in physical activity by age and sex. Older individuals reported spending less time sitting, although older women spent more time sitting than older men. Regardless of age, men self-reported engaging in more moderate physical activity than women. Older men reported participating more in light physical activity, while younger men engaged in more intense physical activity. Older women performed less intense physical activity.
Men obtained a higher total METs-min/week than women, irrespective of age. Older men had a higher BMI, while younger women had a lower BMI. Additionally, women and older individuals showed a higher score on the MDS14. Regarding HRQoL, younger people had a higher PCS12 score, regardless of sex, while men and older individuals had a better MCS12 score, and young women scored lower on the MCS12.

Table 1 Sociodemographic and health characteristics of NUTRiMDEA participants distributed by sex and age, n (%).

Table 2 Lifestyle characteristics of NUTRiMDEA participants distributed by sex and age, n (%).

Table 3 Nutritional, physical activity and HRQoL traits of NUTRiMDEA participants distributed by sex and age, mean (SD).

Exploratory factor analysis and selection of variables

After the exploratory factor analysis of 62 variables and the scree plot test, 19 factors with an eigenvalue greater than 1, explaining 57.5% of the total variance, were obtained. The resulting KMO measure was 0.7578, which is above the rule-of-thumb cut-off of 0.60 for sampling adequacy. Subsequently, after the rotation procedure (promax rotation), a visual inspection of the correlation matrix was performed, and we considered an absolute factor loading ≥ 0.30 as significant for each factor (Supplementary Fig. S2). The variables related to HRQoL were most relevant in factor 1. Age, sex, cardiometabolic health, some aspects of HRQoL, and smoking habits determined factor 2. Factor 3 comprised questions related to the Mediterranean diet, while smoking habits and some aspects of HRQoL were included in factor 4. Family medical history and physical activity were dominant for factors 5 and 6. Living situation was relevant for factors 5 and 7, while prevalence of obesity and smoking were important for factor 8. Factor 9 consisted of sleeping and smoking habits. Factor 10 contained aspects of food.
Factor 11 was related to living as a couple, sleeping, and snacking habits. For factor 12, living alone, living with others, and occupation were the most involved variables. Ethnicity and use of olive oil as the main fat in meals were important for factor 13. Sex, diabetes prevalence, consumption of legumes, and physical exercise were contained in factor 14. Factor 15 comprised questions related to the nap habit and feeling calm and quiet in the HRQoL questionnaire. Dyslipidemia and added salt in dishes were significant contributors to factor 16. Living with the elderly and prevalence of diabetes were relevant for factor 17, while living with the elderly and consuming more white meat than red and processed meat were important for factor 18. Lastly, factor 19 was associated with education, wine consumption, servings of fish and shellfish per week, and consuming more white meat than red and processed meat. The hierarchical cluster analysis resulted in 5 lifemetabotypes, which are presented descriptively in a dendrogram (Fig. 2).

Fig. 2 Dendrogram. Phenotypic description of cluster analysis. As can be seen on the abscissa axis, the dendrogram graphically represents the number of groups. Within each group, the corresponding population number is shown (n). The colors show the 5 optimal clusters used to characterize the phenotypes.

Clustering information

The clustering technique yielded 5 clusters. After analyzing the variables for each cluster, phenotypic characteristics were identified (Table 4 and Supplementary Tables S1 and S2). Cluster 1, labeled ‘Westernized Millennial’, encompassed 967 participants, mainly aged 18–40 years, predominantly female, Caucasian, with university education, and either employed or students. Most reported living with others. This cluster reported no significant prevalence of cardiometabolic diseases or family history, but nearly half reported frequent sadness or depression.
Respondents in this cluster declared the highest red/processed meat consumption, medium-low Mediterranean diet adherence, and medium physical activity levels. This cluster also had the highest percentage of non-smokers, with smokers reporting the fewest cigarettes smoked daily. The PCS12 score was medium, while the MCS12 score was medium-low.

Cluster 2, ‘Healthy’, included 10,616 volunteers. This group reported being mainly middle-aged (40–55 years), with a slightly higher proportion of women than men, and a predominance of Caucasians over Hispanics. Most participants declared having university education, living as a couple and with children, being employed, maintaining normal weight, having no cardiometabolic diseases but a family history of HBP and dyslipidemia, not snacking between meals, being non-smokers, and having no depression symptoms. This population showed high adherence to the Mediterranean diet and a high level of moderate physical activity, and achieved a medium score in HRQoL.

Cluster 3, named ‘Mediterranean Youth-Adult’, comprised 2,013 participants. This group primarily consisted of young and middle-aged individuals (25–55 years), with a higher proportion of women and Caucasians. Participants were university-educated, employed, and lived alone. This cluster reported the lowest prevalence of obesity, diabetes, and familial history compared to the other clusters. They self-reported minimal depression symptoms and the highest adherence to the Mediterranean diet. Their physical activity level was moderate, but sedentary hours were high. Most were non-smokers, with a medium HRQoL score.

Cluster 4, ‘Pre-morbid’ (n = 600), included diverse age groups and more women, comprised both Caucasians and Hispanics, and was often university-educated. Participants reported living with elders and being employed, students, or unemployed. Many were overweight and declared a family history of HBP and dyslipidemia. About half experienced depression and most were non-smokers.
They self-reported frequent snacking and adding salt. Adherence to the Mediterranean diet was low, but they engaged in slightly more physical activity. The PCS12 was medium, while the MCS12 was relatively lower.

Cluster 5, ‘Pro-morbid’ (n = 312), involved middle-aged to elderly individuals, more women, and Caucasians. Most declared having university or professional education and living with a partner and children. The cluster included varying proportions of employed, retired, unemployed and disabled individuals. The participants had a balanced distribution between normal weight and overweight, with a noticeable prevalence of obesity. This cluster reported the highest prevalence of cardiometabolic diseases and family history, as well as more depressive symptoms. They self-reported sleeping less than 7–8 h per night and had the lowest water intake. Sedentary time ranged from 5–7 h/day to 8–10 h/day. Adherence to the Mediterranean diet, physical activity, and PCS12 were low, while the MCS12 was medium.

Table 4 Description of the most relevant characteristics of the participants based on the variables with the greatest importance for the computational phenotyping algorithm.

Computational algorithm development

After the forward stepwise regression was performed, the final model identified the following variables: age (18–25 years / 25–40 years / 40–55 years / 55–70 years / >70 years); sex (Female / Male / Do not specify); t-shirt size (XS / S / M / L / XL / XXL); occupation (Unemployed / Student / Disability / Retired / Houseworker / Employed); ethnicity (Caucasian / European / Hispanic / Latin / African / Asian / Mestizo / Other / Prefer not to specify); live alone, with older, with other (Yes / No); sleep weekdays (< 5 h per day / 5–6 h per day / 7–8 h per day / 9–10 h per day / >10 h per day); prevalence of obesity and diabetes (Yes / No); familial obesity, diabetes and HBP (Yes / No / DKDA); water (1–2 glasses per day / 3–4 glasses per day / 5–6 glasses per day / 7–8 glasses per day / 9–10 glasses per day / >10 glasses per day); number of meals (1 or 2 meals per day / 3 meals per day / 4 meals per day / 5 meals per day / ≥6 meals per day); red and processed meats, and butter/cream/margarine (Never or rarely / 1 serving per day / ≥2 servings per day); sugar sweetened beverages (Never or rarely / 1 or 2 servings per day / ≥3 servings per day); fish and seafood (Never or rarely / 1 or 2 servings per week / ≥3 servings per week); preference for white over red meat (Yes / No); moderate physical activity (hours / week); self-perception of health (Excellent / Very good / Good / Fair / Poor); limited in moderate activities and climbing stairs (Yes, limited a lot / Yes, limited a little / No, not limited at all); accomplished less due to physical health or emotional problems (Yes / No); limited in work or other activities (Yes / No); didn’t work as carefully due to emotional problems (Yes / No); pain (Not at all / A little bit / Moderately / Quite a bit / Extremely); downhearted and blue, and calm and peaceful (All of the time / Most of the time / A good bit of the time / Some of the time / A little of the time / None of the time).

The beta coefficients (β) of the variables selected for the development of the computational algorithm and the contribution of each variable to the variance explained by the model (R²) are presented in Supplementary Table S3. The variables that contributed the most to the model were: live with older (27.8%), live alone (12.9%), live with other (6.6%) and limited in moderate activities (1.3%).

A computational algorithm was obtained through the formula that allowed estimating the classification of each lifemetabotype (Fig. 3).
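As an illustration of how forward stepwise selection can attribute a share of explained variance (R²) to each variable, the following minimal sketch adds, at each step, the predictor that most increases R² and records that increase. It is not the authors' implementation: the function name `forward_stepwise`, the stopping threshold `min_gain`, and the toy data are assumptions made for this example, and the predictors are treated as numeric rather than categorical.

```python
def _center(values):
    """Subtract the mean so that the intercept is handled implicitly."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def forward_stepwise(columns, y, min_gain=0.01):
    """Greedy forward selection for ordinary least squares.

    columns:  dict mapping a predictor name to a list of numeric values
    y:        list of numeric responses
    min_gain: hypothetical stopping rule; stop when the best remaining
              predictor adds less than this much R²
    Returns a list of (name, r2_gain) in order of entry into the model.
    """
    resid_y = _center(y)
    total_ss = _dot(resid_y, resid_y)
    remaining = {name: _center(col) for name, col in columns.items()}
    selected = []
    while remaining:
        best_name, best_gain = None, 0.0
        for name, x in remaining.items():
            xx = _dot(x, x)
            if xx < 1e-12:  # predictor already explained by selected ones
                continue
            gain = _dot(x, resid_y) ** 2 / xx / total_ss
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None or best_gain < min_gain:
            break
        x_best = remaining.pop(best_name)
        xx = _dot(x_best, x_best)
        # Remove the part of the response explained by the new predictor.
        b = _dot(x_best, resid_y) / xx
        resid_y = [r - b * a for r, a in zip(resid_y, x_best)]
        # Residualize the remaining candidates against the new predictor,
        # so later gains measure only what they add beyond the model so far.
        remaining = {
            name: [xi - (_dot(x_best, x) / xx) * a for xi, a in zip(x, x_best)]
            for name, x in remaining.items()
        }
        selected.append((best_name, best_gain))
    return selected


# Toy data (invented): the response depends strongly on x1, weakly on x2,
# and not at all on x3.
toy = {
    "x1": [0, 1, 0, 1, 0, 1, 0, 1],
    "x2": [0, 0, 1, 1, 0, 0, 1, 1],
    "x3": [0, 0, 0, 0, 1, 1, 1, 1],
}
response = [3 * a + b for a, b in zip(toy["x1"], toy["x2"])]
steps = forward_stepwise(toy, response)
```

In this toy case `steps` lists `x1` first and `x2` second, and `x3` is never added because it contributes nothing once the other two are in the model.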
When the probability of participants being classified into the different clusters was calculated with the random forest algorithm, the following results were obtained: cluster 1 had an 81.9% probability of being classified in its own group, cluster 2 94.6%, cluster 3 82.5%, cluster 4 79.7%, and cluster 5 77.5% (Supplementary Table S4).

Fig. 3 Computational algorithm for the classification of phenotypes. Intercept = 3.5092.
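One way to read the per-cluster percentages above is as the share of each cluster's members that a classifier assigns back to their original cluster. The sketch below computes that share from two label lists. This reading is an assumption, as are the function name `per_cluster_agreement` and the toy labels; the source does not describe how the random forest output was summarized.

```python
from collections import Counter


def per_cluster_agreement(original_labels, predicted_labels):
    """Share of each original cluster that the classifier assigns to the same cluster."""
    totals = Counter(original_labels)
    hits = Counter(
        orig for orig, pred in zip(original_labels, predicted_labels) if orig == pred
    )
    return {cluster: hits[cluster] / totals[cluster] for cluster in sorted(totals)}


# Toy labels (invented), not data from the study.
original = [1, 1, 1, 1, 2, 2, 3, 3, 3, 3]
predicted = [1, 1, 1, 2, 2, 2, 3, 3, 1, 1]
agreement = per_cluster_agreement(original, predicted)
```

For these toy labels, three of the four members of cluster 1 are assigned back to cluster 1, both members of cluster 2 are recovered, and half of cluster 3 is recovered.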

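The two retention rules mentioned in the exploratory factor analysis section (eigenvalue greater than 1, and absolute factor loading ≥ 0.30) can be sketched as follows. The function names, the eigenvalues, and the loadings are hypothetical and serve only to show how the rules are applied; they are not values from the study.

```python
def kaiser_retained(eigenvalues):
    """Return (number of factors with eigenvalue > 1, share of total variance they explain)."""
    retained = [ev for ev in eigenvalues if ev > 1.0]
    return len(retained), sum(retained) / sum(eigenvalues)


def salient_variables(loadings, threshold=0.30):
    """Map each factor index to the variables whose absolute loading meets the threshold.

    loadings: dict mapping a variable name to its list of loadings, one per factor
    """
    n_factors = len(next(iter(loadings.values())))
    return {
        factor: [name for name, row in loadings.items() if abs(row[factor]) >= threshold]
        for factor in range(n_factors)
    }


# Hypothetical eigenvalues of a 4-variable correlation matrix.
n_kept, share = kaiser_retained([2.5, 1.5, 0.6, 0.4])

# Hypothetical rotated loadings on two factors.
example_loadings = {
    "pcs12": [0.72, 0.10],
    "smoking": [-0.35, 0.41],
    "olive_oil": [0.05, 0.29],
}
salient = salient_variables(example_loadings)
```

Here two factors are retained and together account for 80% of the total variance. A variable such as `smoking` can be salient on more than one factor, which matches the way smoking habits appear in several factors in the text.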