A permutable MLP-like architecture for disease prediction from gut metagenomic data | BMC Bioinformatics

This section describes the experimental setup and then analyzes and discusses the experimental results.

Table 2 Classification performance comparison of MetaP with existing methods

Experimental setup

The parameter settings of the Permutable MLP-like network used for metagenomic classification are as follows. Because the dimensions of the 2D matrix must be divisible by the patch dimensions, we fixed the patch width at 3 and, when constructing the 2D matrix, ensured that its width was divisible by 3. The patch height was set to match the reduced matrix height, which typically does not exceed 8. For example, in the T2D dataset the matrix height was only 5.

MetaP has several hyperparameters: the number of Permutator blocks \(P_n\), the number of channel dimension split segments \(S_n\), the number of hidden units in the neural network \(h_u\), the initial learning rate \(lr\) of the Adam [26] optimizer, the batch size \(bs\), the number of training epochs \(epoch\), and two parameters of the step learning rate scheduler (StepLR) provided by PyTorch, namely the step size \(\alpha\) and the decay factor \(\gamma\). We explored combinations of these parameters over the following ranges: \(P_n \in \{1, 2, 3\}\), \(S_n \in \{2, 4, 6, 8, 10\}\), \(h_u \in \{32, 48, 64, 128\}\), \(lr \in \{0.05, 0.005, 0.0005, 0.00005\}\), \(bs \in \{16, 32, 64, 128\}\), \(epoch \in \{10, 20, 30, 40, 50\}\), \(\alpha \in \{5, 10\}\), \(\gamma \in \{0.1, 0.3, 0.5, 0.7\}\).
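To give a sense of the size of this search space, the following minimal sketch enumerates the full Cartesian product of the ranges listed above and evaluates the step decay rule for one setting drawn from those ranges. It assumes an exhaustive grid, which the text does not state (it only says that different combinations were explored). The names `GRID`, `all_configs`, and `step_lr` are illustrative and are not taken from the MetaP code. The closed form in `step_lr` mirrors the behavior of PyTorch's `StepLR`, which multiplies the learning rate by `gamma` every `step_size` epochs.

```python
from itertools import product

# Search ranges as listed in the text; the dictionary keys are illustrative names.
GRID = {
    "P_n": [1, 2, 3],
    "S_n": [2, 4, 6, 8, 10],
    "h_u": [32, 48, 64, 128],
    "lr": [0.05, 0.005, 0.0005, 0.00005],
    "bs": [16, 32, 64, 128],
    "epoch": [10, 20, 30, 40, 50],
    "alpha": [5, 10],
    "gamma": [0.1, 0.3, 0.5, 0.7],
}


def all_configs(grid):
    """Yield every combination in the grid as a dict (assumes an exhaustive search)."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def step_lr(initial_lr, epoch, alpha, gamma):
    """Learning rate at a 0-based epoch when it is multiplied by gamma every alpha epochs."""
    return initial_lr * gamma ** (epoch // alpha)


# Number of settings in the full grid: 3 * 5 * 4 * 4 * 4 * 5 * 2 * 4.
n_configs = sum(1 for _ in all_configs(GRID))
print(n_configs)

# Example schedule for lr = 0.0005, alpha = 5, gamma = 0.5 over 20 epochs.
schedule = [step_lr(0.0005, e, alpha=5, gamma=0.5) for e in range(20)]
print(schedule[0], schedule[5], schedule[19])
```

A full grid of this size combined with repeated cross-validation would be costly, so exploring only a subset of the combinations is a plausible reading of the text.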
After tuning, we used the following parameters for MetaP in all subsequent experiments: \(P_n=1, S_n=8, h_u=48, lr=0.0005, bs=32, epoch=20, \alpha =5, \gamma =0.5\).

To verify the effectiveness of the proposed method, we compared it with the MetAML [11], MetaNN [16], and PopPhy-CNN [10] methods from previous studies. For these comparison models, we followed the hyperparameter settings and model architectures recommended in their original papers or in the provided open-source code. We performed 10-fold cross-validation 10 times and averaged the results to make them more reliable. We used five metrics, namely the area under the receiver operating characteristic curve (AUC), the Matthews correlation coefficient (MCC), Precision, Recall, and F1 score, to evaluate the models from different perspectives.

We evaluated the model separately on binary and multiclass classification tasks. For the binary classification task, we conducted experiments on the three publicly available datasets described in previous sections. For the multiclass classification task, we created a Multi-Disease dataset of 829 samples covering multiple diseases by combining these three datasets based on the species-level abundance features present in each sample.

Fig. 3 Comparing the ROC curves of MetaP and five other methods based on 10-fold cross-validation on the three disease datasets

Table 3 Comparison of PopPhy-CNN and MetaP on dense versus sparse matrix classification performance

The classification performance of models

The classification performance of the different models on each dataset is presented in Table 2, and for visual comparison, the corresponding ROC curves of RF, SVM, MLPNN, 1DCNN, PopPhy-CNN, and MetaP are shown in Fig. 3. Among the machine learning algorithms, Random Forest (RF) exhibits the best classification performance on the metagenomic datasets.
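The five metrics reported in Table 2 can be computed per fold from the true labels, the predicted scores, and the thresholded predictions. The following is a minimal pure-Python sketch for the binary case and is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 marks the diseased class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred):
    """Precision, Recall, F1, and MCC from hard predictions (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with made-up values (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Under the protocol described in the experimental setup, such per-fold values would then be averaged over the 10 folds and the 10 repetitions.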
Our experimental results show that the proposed MetaP model achieved the best classification performance on the Obesity, T2D, and Multi-Disease datasets, while performing slightly worse than RF on the Cirrhosis dataset. We attribute the superior performance of MetaP on these three datasets to their larger sample sizes compared with the Cirrhosis dataset. Previous studies [27] have also shown that deep learning models generally outperform traditional machine learning methods on datasets with more samples.

Additionally, we trained the PopPhy-CNN and MetaP models on both sparse and dense matrices. As shown in Table 3, in most cases reducing the matrix height to mitigate data sparsity does not harm classification performance. On the contrary, performance improves slightly, which may be attributed to an enhanced focus on important features and reduced interference from redundant information. However, the PopPhy-CNN model, which uses a convolutional neural network, shows a slight decline in classification performance on the Obesity dataset, which has fewer features, whereas our proposed MetaP model is unaffected. Overall, we believe that reducing the matrix height in this manner improves computational efficiency without compromising classification performance.

Identification of disease-associated microbial features

To interpret the black-box MetaP model and identify the microbial features that play a significant role in classification, we applied the Kernel SHAP method to our model and obtained the SHAP values for each feature from a 10-fold cross-validation. In Fig. 4, we present the top 20 microbial features by SHAP value in each of the three disease datasets, along with their average relative abundances in the samples. Consistent with previous studies [11], we found that the importance of each microbial feature was not strongly correlated with its average relative abundance across samples, indicating the complexity of the microbial system.

Fig. 4 The top 20 microbial features with the highest SHAP values in each of the three disease datasets. For each species, the SHAP values on the vertical axis decrease from top to bottom, while the two horizontal bars represent the average relative abundance observed in the healthy samples (depicted in green) and the diseased samples (depicted in red). The signs in parentheses indicate whether the SHAP value is positive or negative

In the Cirrhosis dataset, on which the models achieved the best classification performance, our model identified the Veillonella, Lactobacillus, and Streptococcus genera as important microbial features. These features were also identified in the original studies [7, 28]. In the T2D and Obesity datasets, on which classification accuracy was lower, the important microbes identified by our model can serve as candidate sets for future experimental studies of the association between microbes and diseases. Our model also detected important species that have been previously reported in the literature, for example Lactobacillus mucosae [29] and Olsenella spp. [30] in association with T2D, and Ruminococcus bromii [31] and Eubacterium siraeum [32] in association with Obesity.
Additionally, we observed a correlation between the sign of the SHAP values and the difference in average microbial abundance between patients and healthy individuals, particularly in datasets with good classification performance. Microbes with a higher relative abundance in the diseased population often exhibit positive SHAP values. For instance, in the Cirrhosis dataset, among the top 20 important microbes, only Megamonas spp. and Adlercreutzia equolifaciens showed higher abundance in the healthy population, with negative SHAP values, while the remaining microbes exhibited higher relative abundance in the diseased population. In conclusion, we believe that improving the classification performance of models on metagenomic data can reveal potential associations between microbes and diseases that warrant further investigation.
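The sign relationship described in this section can be checked with a simple agreement count: for each feature, compare the sign of its SHAP value with the sign of the abundance difference between diseased and healthy samples. The sketch below assumes one signed SHAP value per feature. The function name `sign_agreement`, the placeholder feature names, and the toy values are hypothetical and are not taken from the paper's data.

```python
def sign_agreement(shap_values, abundance_diseased, abundance_healthy):
    """Fraction of features whose SHAP sign matches the sign of (diseased - healthy) abundance."""
    agree = 0
    total = 0
    for name, shap_value in shap_values.items():
        diff = abundance_diseased[name] - abundance_healthy[name]
        if shap_value == 0 or diff == 0:
            continue  # skip features with no defined direction
        total += 1
        if (shap_value > 0) == (diff > 0):
            agree += 1
    return agree / total if total else 0.0


# Hypothetical values for four placeholder features (not data from the paper).
shap_values = {"feature_a": 0.12, "feature_b": 0.05, "feature_c": -0.03, "feature_d": 0.02}
abundance_diseased = {"feature_a": 1.5, "feature_b": 0.8, "feature_c": 0.1, "feature_d": 0.2}
abundance_healthy = {"feature_a": 0.3, "feature_b": 0.2, "feature_c": 0.6, "feature_d": 0.5}

agreement = sign_agreement(shap_values, abundance_diseased, abundance_healthy)
print(agreement)
```

A value close to 1 would indicate that positive SHAP values mostly coincide with higher abundance in the diseased group, which is the pattern reported here for the Cirrhosis dataset.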
