A coordinated adaptive multiscale enhanced spatio-temporal fusion network for multi-lead electrocardiogram arrhythmia detection

To validate the proposed models, we used the two multi-lead electrocardiogram datasets described in Sect. 4.1. STFAC-ECGNet was compared against the baseline models referenced in Sect. 4.2 and against state-of-the-art models.

Results on the PTB-XL dataset

Table 2 presents the results of our models on the PTB-XL test set, where the top two rankings for each metric are indicated in bold italics. Our proposed CBMV-CNN, CAMV-RNN, and the combined STFAC-ECGNet perform strongly across all metrics. CBMV-CNN achieved an Accuracy of 0.880, the highest AUC (0.935), the highest Recall (0.796), a Precision of 0.725, and an F1-score of 0.749. CAMV-RNN attained an Accuracy of 0.889, an AUC of 0.932, a Recall of 0.783, a Precision of 0.748, and an F1-score of 0.763. STFAC-ECGNet achieved the highest Accuracy (0.894), the second-highest AUC (0.933), a Recall of 0.756, a Precision of 0.778, and the second-highest F1-score (0.767). Our proposed networks therefore rank at or near the top among the baselines and state-of-the-art networks in Accuracy, AUC, Recall, and F1-score, although there is room for improvement in Precision. Using the F1-score as the benchmark, STFAC-ECGNet differs from the state-of-the-art network 2D-ECGNet by a margin of only 0.003.

Table 2 Diagnostic performance comparison with other baseline and advanced networks on the PTB-XL test dataset.

Table 3 details the evaluation metrics of the different networks across the five PTB-XL diagnostic superclasses. In the CD (Conduction Disturbance) category, STFAC-ECGNet secured the top Accuracy and Precision, with scores of 0.892 and 0.869, respectively. For the HYP (Hypertrophy) category, CBMV-CNN achieved the best Accuracy at 0.918. In the MI (Myocardial Infarction) category, CBMV-CNN led in AUC, Recall, and F1-score, with scores of 0.943, 0.863, and 0.853, respectively; STFAC-ECGNet matched CBMV-CNN's top AUC in this category. Within the NORM (Normal) category, STFAC-ECGNet attained an AUC of 0.949, a score equaled by CAMV-RNN, which also achieved the highest Recall at 0.883. For the STTC (ST-T Change) category, STFAC-ECGNet obtained the top scores in Accuracy, Precision, and F1-score (0.894, 0.846, and 0.845, respectively), with CAMV-RNN also reaching the top F1-score. CBMV-CNN recorded the best AUC and Recall, at 0.938 and 0.856. In summary, using the F1-score as a reference, STFAC-ECGNet surpasses all competing advanced models.
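The Precision, Recall, and F1-score values reported in Tables 2 and 3 can be derived from per-class confusion counts. The following is a minimal sketch, assuming macro averaging over the diagnostic classes (the averaging scheme is an assumption, as it is not stated in this section). The function names and the toy counts are hypothetical and are not taken from the paper's data.

```python
from typing import Dict, Tuple

def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def macro_average(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Unweighted mean of per-class precision, recall and F1.

    `counts` maps a class name to its (tp, fp, fn) triple.
    """
    per_class = [prf(*c) for c in counts.values()]
    n = len(per_class)
    return tuple(sum(m[i] for m in per_class) / n for i in range(3))

# Hypothetical counts for two classes, for illustration only.
toy_counts = {"A": (80, 20, 20), "B": (50, 50, 0)}
macro_p, macro_r, macro_f1 = macro_average(toy_counts)

# Harmonic mean of the averaged precision and recall, for comparison.
harmonic = 2 * macro_p * macro_r / (macro_p + macro_r)
```

With these toy counts the macro F1-score is about 0.733, while the harmonic mean of the macro Precision and macro Recall is about 0.755. Under macro averaging, a reported F1-score therefore need not equal the harmonic mean of the reported Precision and Recall.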
Table 3 Diagnostic performance comparison with other baseline and advanced networks on the PTB-XL test dataset in five diagnostic categories. Bold font indicates the highest value of each metric for each diagnostic category across the compared models.

Results on the CPSC 2018 dataset

According to Table 4, our proposed networks perform strongly in the I-AVB, LBBB, RBBB, STD, and STE diagnostic categories, as highlighted in bold italics. STFAC-ECGNet achieves F1-scores of 0.940 and 0.762 in the RBBB and STE diagnostic categories, respectively. CAMV-RNN achieves F1-scores of 0.911, 0.940, 0.846, and 0.848 for I-AVB, RBBB, STD, and the average F1-score, respectively. STFAC-ECGNet achieves F1-scores of 0.905, 0.918, 0.940, 0.853, 0.765, and 0.852 in the I-AVB, LBBB, RBBB, STE, and average F1-score, respectively. Our proposed model surpasses the baseline and state-of-the-art models in these metrics. Although our model shows no significant advantage in the Normal, AF, PAC, and PVC diagnostic categories, its overall F1-score surpasses those of the baseline and advanced models. We also observed that CAMV-RNN adapts better to the CPSC 2018 dataset than STFAC-ECGNet, possibly because the dataset is smaller and STFAC-ECGNet has more parameters. However, STFAC-ECGNet performs particularly well in the STE diagnostic category, where it outperforms CAMV-RNN by 0.009 in F1-score and also outperforms the baseline and advanced models.

When the CASSAN module or the simplified self-attention (SSAN) module58, which lacks the scale and mask operations, is removed and only the CBAM module is retained in STFAC-ECGNet, the model adapts better to the dataset. This is because we performed upsampling on the original data, adjusting its length to 60 s. In the time series processing module, the CASSAN or SSAN module tends to focus on the padding section, so the original STFAC-ECGNet underperforms the CBAM-only version in various metrics. The CBAM-only STFAC-ECGNet is superior to CAMV-RNN because it introduces the CBMV-CNN module, which enhances feature extraction from grayscale images. The CBAM module is better at concentrating attention on distinguishing the padding and non-padding areas of the image.
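The effect of a missing mask operation can be illustrated with a generic attention softmax. The sketch below is not the authors' CASSAN or SSAN implementation. It only assumes that attention weights come from a softmax over raw scores and that padded positions are marked in a boolean mask. The function name and the example scores are hypothetical, and the scaling step is omitted for brevity.

```python
import math
from typing import List, Optional

def attention_weights(scores: List[float],
                      mask: Optional[List[bool]] = None) -> List[float]:
    """Softmax over the raw attention scores of one query.

    mask[i] is True for real samples and False for padded positions.
    At least one position must be unmasked.
    """
    if mask is not None:
        # Padded positions get a score of -inf, so their weight becomes 0.
        scores = [s if keep else float("-inf") for s, keep in zip(scores, mask)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores: two real positions followed by two padded ones.
raw_scores = [2.0, 1.0, 1.5, 1.5]
is_real = [True, True, False, False]

unmasked = attention_weights(raw_scores)
masked = attention_weights(raw_scores, is_real)

padding_share_unmasked = unmasked[2] + unmasked[3]
padding_share_masked = masked[2] + masked[3]
```

Without the mask, the two padded positions receive a substantial share of the attention (about 0.47 in this toy case). With the mask, their share is exactly zero and the weights are renormalized over the real positions.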
Table 4 Diagnostic performance comparison with other baseline and advanced networks on the CPSC test dataset. The top two rankings for each diagnosis are indicated in bold italics.

Ablation experiments on CBMV-CNN

As shown in Fig. 5, this study investigates the impact of integrating the CBAM module on the performance of the CBMV-CNN network on the two electrocardiogram datasets (PTB-XL and CPSC 2018). On the PTB-XL dataset, the CBMV-CNN model without the CBAM module reaches an accuracy of 0.880 and an AUC of 0.933. With the CBAM module, the accuracy improves by 0.1% and the AUC by 0.2%. The CBAM module also slightly improves recall, from 0.780 to 0.796. Precision decreases slightly, from 0.736 to 0.725, while the F1 score increases slightly, reaching 0.759. On the CPSC 2018 dataset, the introduction of CBAM leads to a more noticeable improvement: accuracy increases by 0.2%, AUC by 1.0%, and recall by 2.5%, with a slight decrease in precision and a 1.2% increase in F1 score. These findings indicate that integrating the CBAM module has a positive impact on the performance of the CBMV-CNN network on electrocardiogram datasets. This suggests that the CBAM module captures critical features in electrocardiogram data more effectively, thereby enhancing the model's classification performance and generalization ability.

Fig. 5 Ablation experiments on CBMV-CNN on the two datasets.

Ablation experiments on CAMV-RNN

As shown in Fig. 6, this study examines how the performance of the CAMV-RNN network on the two electrocardiogram datasets (PTB-XL and CPSC 2018) is affected by the introduction of our proposed CASSAN module and of the simplified self-attention (SSAN) module. On the PTB-XL dataset, integrating the CASSAN module into the CAMV-RNN architecture improves accuracy, precision, and F1 score by 0.4%, 0.11%, and 0.4%, respectively. Introducing the SSAN module increases accuracy and precision by 0.4% and 0.16%, respectively, while AUC remains unchanged. However, a decrease in recall lowers the overall evaluation metric, the F1-score, so that with the SSAN module the F1-score falls below that of the configuration without either the CASSAN or the SSAN block. On the CPSC 2018 dataset, integrating the CASSAN module into the CAMV-RNN architecture similarly enhances accuracy, AUC, recall, precision, and F1 score by 0.1%, 0.1%, 0.6%, 0.4%, and 0.6%, respectively. The improvements obtained with the SSAN module are not as substantial as those achieved with the CASSAN module. In conclusion, the CASSAN module shows a clearer performance advantage over the SSAN module on these two datasets, possibly because it captures the features and patterns in the datasets more effectively.

Fig. 6 Ablation experiments on CAMV-RNN on the two datasets.

Ablation experiments on STFAC-ECGNet

In the ablation experiments illustrated in Fig. 7, the CBAM module consistently enhances the accuracy and F1 score of the model on both the PTB-XL and CPSC 2018 datasets, highlighting the effectiveness of attention mechanisms in helping the model capture key features. Within the STFAC-ECGNet framework, incorporating the CASSAN module leads to better outcomes than integrating the SSAN module. Adding the CBAM module to any configuration can improve various performance metrics, but the combination of CASSAN and CBAM performs best. In addition, the different responses of the same model configuration to different datasets underscore the importance of dataset-specific feature selection and model tuning. On the PTB-XL dataset, the combination of CASSAN and CBAM performs well, whereas on the CPSC 2018 dataset a simpler configuration, or the CBAM module alone, is sufficient to achieve outstanding performance. These disparities can be attributed to differences in dataset scale and network parameters. In general, the proposed STFAC-ECGNet, incorporating the combined CASSAN and CBAM mechanisms, exhibits superior performance compared to baseline networks and state-of-the-art models.

Fig. 7 Ablation experiments on STFAC-ECGNet on the two datasets.

Network parameters

Table 5 lists the parameter configurations of the CAMV-RNN, CBMV-CNN, and STFAC-ECGNet models. All three models share the same input size of 2.30 MB. The CAMV-RNN model has 2.68 M parameters; forward/backward propagation requires 1376.53 MB, parameter storage occupies 10.72 MB, and the estimated total memory usage is 1389.55 MB. The CBMV-CNN model has 7.65 M parameters; forward/backward propagation requires 14,167.20 MB, parameter storage occupies 30.61 MB, and the estimated total memory usage is 14,200.11 MB. The STFAC-ECGNet model has the largest parameter count, at 10.08 M; forward/backward propagation requires 15,543.45 MB, parameter storage occupies 40.27 MB, and the estimated total memory usage is 15,586.02 MB.
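The memory figures above follow a simple additive pattern: the estimated total is the sum of the input size, the forward/backward memory, and the parameter storage. The sketch below reproduces that arithmetic from the reported values. The 4 bytes per parameter (float32 storage) is an assumption that matches the CAMV-RNN figures (2.68 M parameters, 10.72 MB) and is only approximate for the other two models. The function names are hypothetical.

```python
BYTES_PER_PARAM = 4  # assumption: parameters stored as float32

def params_size_mb(num_params_millions: float) -> float:
    """Parameter storage in MB for a count given in millions.

    Millions of parameters times bytes per parameter gives megabytes,
    as long as both use the same definition of "mega".
    """
    return num_params_millions * BYTES_PER_PARAM

def total_size_mb(input_mb: float, fwd_bwd_mb: float, params_mb: float) -> float:
    """Estimated total memory: input + forward/backward + parameters."""
    return input_mb + fwd_bwd_mb + params_mb

# Values as reported in the text: (input, forward/backward, parameter storage).
reported = {
    "CAMV-RNN": (2.30, 1376.53, 10.72),
    "CBMV-CNN": (2.30, 14167.20, 30.61),
    "STFAC-ECGNet": (2.30, 15543.45, 40.27),
}

totals = {name: round(total_size_mb(*sizes), 2) for name, sizes in reported.items()}
camv_rnn_params_mb = params_size_mb(2.68)
```

The computed totals agree with the estimated totals quoted in the text (1389.55 MB, 14,200.11 MB, and 15,586.02 MB). Forward/backward activations dominate the memory footprint in all three models, while parameter storage contributes well under 1%.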
Table 5 Network parameters.

Prototype visualization

Visualization of t-SNE on PTB-XL and CPSC 2018

As shown in Fig. 8a, t-SNE was employed to visualize the feature representations and thereby check whether the diagnostic categories of electrocardiograms (ECGs) are well clustered within the STFAC-ECGNet model. In the color-coded area chart on the Y-axis, 0 represents CD, 1 represents HYP, 2 represents MI, 3 represents NORM, and 4 represents STTC. Fig. 8a shows that, in the high-dimensional space, NORM, CD, MI, and STTC all exhibit distinct boundaries. This is particularly noticeable in the STFAC-ECGNet model, which integrates both CASSAN and CBAM and shows superior overall arrhythmia classification performance.

Fig. 8 Visualization of the STFAC-ECGNet network on two different datasets using t-SNE: PTB-XL (a) and CPSC 2018 (b).

In Fig. 8b, the color-coded area chart on the Y-axis distinguishes the types of cardiac arrhythmia: 0 represents STE, 1 represents AF, 2 represents LBBB, 3 represents RBBB, 4 represents I-AVB, 5 represents PAC, 6 represents Normal, 7 represents STD, and 8 represents PVC. In the high-dimensional space, all arrhythmia types except PAC display distinct boundaries. The PAC diagnostic category is mostly scattered, which is consistent with its relatively lower F1-score in the model evaluation, whereas the other diagnostic categories perform well. This finding supports the reliability of the experimental results.

Confusion matrix plots for PTB-XL and CPSC 2018 diagnosis

Table 6 shows the confusion matrix of STFAC-ECGNet on the PTB-XL dataset. The true positive (TP) rates for CD (conduction disturbance), HYP (hypertrophy), MI (myocardial infarction), and STTC (ST/T change) all reached at least 0.90, at 0.94, 0.95, 0.92, and 0.90, respectively. However, a high TP rate alone does not provide a clear indication for diagnosing arrhythmias; the true negative (TN) rates must also be improved substantially while the TP rates remain high. The TN rate for NORM (Normal ECG) is the highest at 0.94, whereas the TN rates for the four diseases range from 0.63 to 0.82. This indicates that the classifier is better at identifying the true positives of the five classes. Compared to the network proposed by Gokhan Kutluana et al.59, our proposed network increases the TN rate by 0.03 for the CD category, 0.16 for HYP, 0.10 for MI, 0.06 for STTC, and 0.01 for NORM.
Table 6 Confusion matrix of diagnostic results on the PTB-XL dataset.

Table 7 shows the confusion matrix of STFAC-ECGNet on the CPSC 2018 dataset. We employed ten-fold cross-validation to evaluate the model comprehensively. The overall results indicate that our network achieves a true positive (TP) rate exceeding 0.90 for all diagnoses. Particularly noteworthy are its results for atrial fibrillation (AF), first-degree atrioventricular block (I-AVB), left bundle branch block (LBBB), premature ventricular contractions (PVC), right bundle branch block (RBBB), and ST-segment depression (STD), which reach 0.93, 0.87, 0.89, 0.85, 0.94, and 0.77, respectively. Meanwhile, the true negative rates (TNR) of STFAC-ECGNet for the NORM, premature atrial contractions (PAC), and ST-segment elevation (STE) categories are relatively lower, at 0.34, 0.52, and 0.54, respectively. This is consistent with the F1 scores reported in the experimental results and supports their reliability.
Table 7 Confusion matrix of diagnostic results on the CPSC 2018 dataset.
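The TP and TN rates discussed for Tables 6 and 7 can be obtained by treating each diagnostic class as a binary problem. The following is a minimal sketch under that assumption. The way the authors normalized their confusion matrices is not stated in this section, and the function names and toy labels are hypothetical.

```python
from typing import Sequence, Tuple

def binary_confusion(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Counts (tn, fp, fn, tp) for one class, with labels encoded as 0 or 1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp

def tp_tn_rates(tn: int, fp: int, fn: int, tp: int) -> Tuple[float, float]:
    """True positive rate (sensitivity) and true negative rate (specificity)."""
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return tpr, tnr

# Hypothetical labels for a single class (1 = class present).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

counts = binary_confusion(y_true, y_pred)
tpr, tnr = tp_tn_rates(*counts)
```

Repeating this per class yields one (TP rate, TN rate) pair for each diagnostic category, which is the form in which the rates are quoted in the text.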
