Evaluating segment anything model (SAM) on MRI scans of brain tumors

This section presents the evaluation results of the Segment Anything Model (SAM) on two datasets, TCGA and BRATS. An overview of the evaluation process is depicted in Fig. 6. The experiments covered four configurations of SAM: the baseline SAM (0 points), SAM with 1 point, SAM with 5 points, and SAM with a bounding box. These configurations are commonly used because they balance computational efficiency and segmentation accuracy, allowing us to assess the model’s performance under varying levels of user interaction. The 0-point configuration serves as the baseline, relying entirely on the model’s intrinsic capabilities without any user input. The 1-point configuration represents minimal user interaction, providing a single reference point to guide the segmentation, while the 5-point configuration involves moderate user interaction, offering multiple reference points to improve segmentation accuracy, especially in more complex scenarios. All experiments were performed in accordance with the relevant guidelines and regulations.

The performance of each configuration was assessed using two primary evaluation metrics: Dice Score and Intersection over Union (IoU). These metrics quantify the overlap between the predicted and ground truth tumor regions. To gain deeper insight into the model’s performance, we also analyzed the results by tumor size (categorized as small, medium, and large) and by curvature (categorized as low, medium, and high).

The proposed models were trained using the NVIDIA DGX-1, also known as “The Fastest Deep Learning System,” at the AI and Robotics Lab of United Arab Emirates University. This system consists of dual 20-core Intel® Xeon® E5-2698 v4 2.2 GHz CPUs, 40,960 NVIDIA CUDA cores, and 8 Tesla V100 GPUs with a combined GPU memory of 256 GB.

Fig. 6. Representation of steps in the model evaluation process.

Tumor Size and Curvature Analysis

This section presents the experimental results of the tumor size and curvature analysis, examining SAM’s performance across varying tumor sizes and curvatures. The analysis is conducted separately for each SAM configuration within each dataset.

Evaluation on TCGA Dataset

The SAM (No Points) model, evaluated on dataset 1 (TCGA) and summarized in Table 2, demonstrates robust tumor segmentation performance. The heatmaps of the segmentation results on the TCGA dataset are shown in Fig. 7. The model achieves an average Dice Score of 0.7064 and an average IoU of 0.8307, indicating strong overall performance in delineating tumor regions. Despite some fluctuations in segmentation accuracy, with Dice Scores ranging from 0.5107 to 0.8501, the model maintains moderate stability, as evidenced by IoU values within the range of 0.7427 to 0.9319.

Fig. 7. Representation of SAM Dice Score heatmaps on the TCGA dataset.

Furthermore, the model exhibits consistency across diverse tumor sizes and curvatures, with tumor sizes ranging from low to medium and curvature from low to high. The box plots in Fig. 8 show that the model is most proficient at segmenting low and medium-sized tumors, as evidenced by the higher Dice Scores and IoU values in the corresponding box plots.

The TCGA-SAM (One Point) model achieves an average Dice Score of 0.6706 and an average IoU of 0.6745, indicating generally strong segmentation accuracy across the dataset. Its Dice Scores range from 0.5000 to 0.8500, suggesting some variability in segmentation quality.
Similarly, IoU values span from 0.5000 to 0.8500, indicating moderate fluctuations in the overlap between predicted and ground truth segmentations. As illustrated in the box plots (Fig. 9), the TCGA-SAM (One Point) model performs best on low to medium-sized tumors: it consistently achieves higher Dice Scores and IoU values for tumors of low to medium size and curvature, as evidenced by the higher upper quartiles of the boxes in these cases.

The SAM model with 5 points on dataset 1 demonstrates robust overall segmentation performance, with an average Dice Score of 0.6758 and an average IoU of 0.6898. The model is most accurate on tumors of low to medium size, as depicted in Fig. 10. It also handles varying levels of tumor curvature well, as evidenced by strong segmentation performance on high-curvature tumors. This underscores the model’s adaptability to different tumor shapes.

Table 2. Average Dice Score and IoU for SAM configurations (TCGA dataset).

Fig. 8. Segmentation performance distribution across tumor sizes and curvatures: SAM model (0 points).

Fig. 9. Segmentation performance distribution across tumor sizes and curvatures: SAM model (1 point).

Fig. 10. Segmentation performance distribution across tumor sizes and curvatures: SAM model (5 points).

The evaluation results for the SAM model with bounding boxes on dataset 1 reveal strong overall segmentation performance, as depicted in the box plots in Fig. 11. The model achieves an average Dice Score of 0.6972 and an average IoU of 0.8553, indicating consistent accuracy in delineating tumor boundaries.
Particularly noteworthy is the model’s proficiency in segmenting tumors of low to medium size, as shown by the upper quartiles of the corresponding box plots. The box plots also illustrate the model’s robust performance across varying levels of tumor curvature: high-curvature tumors exhibit consistently high segmentation accuracy, showing the model’s adaptability to different tumor shapes.

Fig. 11. Segmentation performance distribution across tumor sizes and curvatures: SAM model (SAM with bounding box).

Overall, the SAM with bounding box configuration demonstrates superior localized segmentation precision compared to the other configurations. The model excels in delineating tumor boundaries, which is crucial for precise treatment planning. By focusing on the region of interest defined by the bounding box, the model minimizes segmentation errors and produces high-resolution results. The robustness of this configuration is evident in its consistent performance across diverse volumes. Minimal fluctuations in Dice Score and IoU indicate that it is less sensitive to variations in tumor characteristics, making it a reliable choice for scenarios with different tumor shapes, sizes, and curvatures. Compared to the other configurations, SAM with bounding box consistently delivers reliable results, highlighting its superiority in achieving accurate and clinically valuable tumor segmentation.

Evaluation on BRATS Dataset

The SAM (No Points) model, evaluated on dataset 2 (BRATS) and summarized in Table 3, demonstrates robust tumor segmentation performance. The heatmaps of the segmentation results on the BRATS dataset are shown in Fig. 12. The model achieves an average Dice Score of 0.5737 and an average IoU of 0.6156, indicating strong overall accuracy in delineating tumor regions.
Despite some fluctuations in segmentation accuracy, with Dice Scores ranging from 0.3677 to 0.7553, the model maintains moderate stability, as seen in IoU values within the range of 0.3688 to 0.8195. Moreover, the model shows consistency across diverse tumor sizes and curvatures, ranging from low to high. The box plots in Fig. 13 show that the model is most proficient at segmenting low and medium-sized tumors, as evident from the higher Dice Scores and IoU values.

Fig. 12. Representation of SAM Dice Score heatmaps on the BRATS dataset.

The BRATS-SAM (One Point) model achieves an average Dice Score of 0.5816 and an average IoU of 0.5817, indicating generally strong segmentation accuracy across the dataset. Its Dice Scores (0.4451 to 0.7548) and IoU values (0.4453 to 0.7548) suggest some variability in segmentation quality. As illustrated in the box plots in Fig. 14, the model performs best on low to medium-sized tumors: it consistently achieves higher Dice Scores and IoU values for tumors of low to medium size and curvature, as shown by the higher upper quartiles in these cases.

The SAM model with 5 points on dataset 2 demonstrates robust overall segmentation performance, with an average Dice Score of 0.5816 and an average IoU of 0.5827. The model is most accurate on tumors of low to medium size, as depicted in Fig. 15. It also handles varying levels of tumor curvature well, with strong segmentation performance on high-curvature tumors. This underscores the model’s adaptability to different tumor shapes.

Table 3. Average Dice Score and IoU for SAM configurations (BRATS dataset).

Fig. 13. Segmentation performance distribution across tumor sizes and curvatures: SAM model (0 points).

Fig. 14. Segmentation performance distribution across tumor sizes and curvatures: SAM model (1 point).

Fig. 15. Segmentation performance distribution across tumor sizes and curvatures: SAM model (5 points).

The results for the SAM model with bounding boxes on dataset 2 reveal good overall segmentation performance, as illustrated in the box plots in Fig. 16. The model attains an average Dice Score of 0.5642 and an average IoU of 0.6516, underscoring its consistent accuracy in delineating tumor boundaries. The model is most precise on tumors of low to medium size, as shown by the elevated upper quartiles in the corresponding box plots. The box plots also demonstrate strong performance across varying degrees of tumor curvature: high-curvature tumors consistently display high segmentation accuracy, highlighting the model’s adaptability to diverse tumor shapes.

Fig. 16. Segmentation performance distribution across tumor sizes and curvatures: SAM model (SAM with bounding box).

The SAM model configured with bounding boxes demonstrates superior precision in localized segmentation compared to the alternative setups. It excels in delineating tumor boundaries, a crucial aspect of precise treatment planning. By focusing on the region of interest defined by the bounding box, the model minimizes segmentation errors and provides high-resolution outcomes. Its performance is robust across diverse volumes, with minimal fluctuations in Dice Score and IoU, showing its reliability in scenarios with varying tumor shapes, sizes, and curvatures. SAM with bounding boxes consistently delivers reliable results, highlighting its superiority in achieving accurate and clinically valuable tumor segmentation.
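
The two metrics reported throughout this evaluation, Dice Score and IoU, have standard definitions for binary masks. The following is a minimal Python sketch of those definitions, assuming the predicted and ground truth masks are NumPy arrays of the same shape. The function name `dice_and_iou` and the convention for two empty masks are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks of identical shape.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("pred and truth must have the same shape")

    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()

    if total == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0, 1.0

    dice = 2.0 * intersection / total  # 2|A ∩ B| / (|A| + |B|)
    iou = intersection / union         # |A ∩ B| / |A ∪ B|
    return float(dice), float(iou)


# Toy example: a 4-pixel ground truth tumor and a 6-pixel prediction
# that fully covers it (4 overlapping pixels).
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True

dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")  # Dice = 0.8000, IoU = 0.6667
```

For a single pair of masks, the two values are linked by IoU = Dice / (2 - Dice), which can serve as a sanity check on per-case results before they are averaged over a dataset.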
