Real-time segmentation of biliary structure in pure laparoscopic donor hepatectomy

Patients

This study is a single-institution retrospective feasibility analysis of 30 intraoperative videos of PLDH performed at Samsung Medical Center between January 2021 and April 2022, all utilizing the intraoperative ICG near-infrared fluorescence method. Our center has extensive experience in PLDH, having performed over 600 cases20. The surgical team consists of four experienced donor surgeons, although all videos included in this study were from procedures performed by a single surgeon (GS. Choi). All donors were injected with 0.1 mg/kg of indocyanine green (Diagnogreen, Daiichi Sankyo Co, Tokyo, Japan) intraoperatively, approximately 30 min before exposure of the hilar plate21. The biliary structures were clearly visualized using an infrared endoscopic camera (IR Telescopes 10 mm, Olympus, Tokyo, Japan). The types of bile ducts were classified according to the modified classification system proposed by Huang et al.: type I, normal anatomy; type II, trifurcation of the right anterior, right posterior, and left hepatic ducts; type III, right posterior duct draining into the left hepatic duct; type IV, early branching of the right posterior duct from the common hepatic duct; type V, right posterior duct draining into the cystic duct; type VI, other variations in bile duct anatomy22. For detailed information regarding our bile duct division technique, we refer readers to our previously published paper23. The study was conducted in accordance with the Declaration of Helsinki and the Declaration of Istanbul, and was reviewed and approved by the Institutional Review Board (IRB, SMC-2022-07-149-001). Owing to the retrospective nature of the study, the IRB of Samsung Medical Center waived the requirement for informed consent.

Video segmentation and training dataset

The videos were recorded in MP4 format at a display resolution of 1920 × 1080 pixels and a frame rate of 30 frames per second (fps). Frames were extracted at a rate of 10 fps from each video using ffmpeg 4.1 software (www.ffmpeg.org), capturing the period from bile duct isolation to the opening of the anterior wall of the right hepatic duct (a minimal extraction sketch is given below). Frames with fields obscured by smoke, with biliary structures completely obscured by surgical instruments, or with the camera positioned outside the surgical field were excluded. Finally, 300 images (10 images from each of the 30 intraoperative videos) were selected for model training and validation. Five-fold cross-validation was performed at the level of the 30 videos (see the split sketch below): for each validation cycle, four of the five groups (24 videos) were used to train the model, while the remaining group (6 videos) was reserved for validation. This process was repeated five times, with each group serving as the validation set once and as part of the training set four times (Fig. 1).

Fig. 1 Schematic representation of five-fold cross-validation. Each row represents one of the five 'folds' used in the validation process, with a total of 30 patient videos divided into training and validation sets. The columns represent individual patient videos, each containing 10 images, as indicated by the numbers 1 through 30. Shaded boxes within each fold indicate the videos selected as the validation set for that particular cycle, with the remaining videos used as the training set. Each video serves as part of the validation set once throughout the five cycles, ensuring that every video contributes to the validation of the model, while being used four times in the training set.
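The frame-extraction step can be reproduced with a short script. The following is a minimal sketch that invokes ffmpeg's fps filter from Python, assuming ffmpeg is available on the system path; the file names and output naming pattern are illustrative only.

```python
# Minimal sketch of 10-fps frame extraction with ffmpeg (assumed on PATH).
import subprocess
from pathlib import Path

def extract_frames(video_path: str, out_dir: str, fps: int = 10) -> None:
    """Extract frames at `fps` frames per second using ffmpeg's fps filter."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["ffmpeg", "-i", video_path, "-vf", f"fps={fps}",
         str(Path(out_dir) / "frame_%06d.png")],  # illustrative output pattern
        check=True,
    )

# e.g. extract_frames("case01.mp4", "frames/case01")
```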
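The video-level split of Fig. 1 can likewise be sketched in a few lines; scikit-learn, the shuffling, and the random seed below are assumptions for illustration. The essential point is that the split is made on video IDs rather than on individual frames, so all 10 frames of a given video fall on the same side of the train/validation boundary.

```python
# Sketch of the five-fold, video-level cross-validation split (Fig. 1).
from sklearn.model_selection import KFold

video_ids = list(range(1, 31))  # 30 intraoperative videos
kfold = KFold(n_splits=5, shuffle=True, random_state=42)  # seed is an assumption

for fold, (train_idx, val_idx) in enumerate(kfold.split(video_ids), start=1):
    train_videos = [video_ids[i] for i in train_idx]  # 24 videos per cycle
    val_videos = [video_ids[i] for i in val_idx]      # 6 videos per cycle
    print(f"fold {fold}: train={len(train_videos)}, val={len(val_videos)}")
```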
Annotation of biliary structure

In every intraoperative video included in this study, biliary structures were confirmed using the indocyanine green (ICG) near-infrared fluorescence method (Fig. 2a,b). Pixel-wise labeling of the biliary structure and the transection site was performed with reference to the ICG fluorescence images from each intraoperative video. Annotations were completed using the Computer Vision Annotation Tool (www.cvat.org, Intel). The proposed model is designed to perform segmentation in two distinct ways. First, segmentation was carried out to mask the entire biliary structure, as predicted with reference to the ICG image (annotated as BD, bile duct; Fig. 2b,e). Second, annotation was performed to mask the anterior wall of the junction of the common and right hepatic ducts, which represents the area of interest for the operator when opening the bile duct (annotated as AW, anterior wall; Fig. 2c,f). Annotation was performed by a fellow surgeon (N. Oh) and confirmed by a senior surgeon who has performed more than 300 PLDH cases (GS. Choi).

Fig. 2 Ground truth annotation process. This figure illustrates the step-by-step process of creating ground truth annotations for biliary structure segmentation by referencing indocyanine green (ICG) cholangiography. (a) Actual surgical images extracted from the procedure, (b) structures of the bile duct extracted from ICG cholangiography, (c) the actual site where the bile duct was transected during surgery, (d) the Dice Similarity Coefficient (DSC), quantitatively showing the level of agreement between the ground truth and the AI-inferred regions, (e) manual segmentation of the bile duct structure, generated with reference to (b), (f) segmentation of the anterior wall, representing the proposed area for bile duct transection, created with reference to (c).

Deep learning model

The model architecture employed DeepLabv3+ as its foundation, with ResNet50 pre-trained on the ImageNet dataset serving as the encoder24,25,26 (a construction sketch is given below). To address the limitation of a small dataset, data augmentation techniques were applied: geometric transformations (flips, rotations, etc.), color transformations (contrast, saturation, hue, etc.), Gaussian noise, and patch-based zero masking (see the pipeline sketch below). These augmentation techniques increased the DSC by 1.4 percentage points compared with not using them (Supplementary Table 1). All data were normalized according to the mean and standard deviation of each RGB channel and resized to 256 × 256 pixels. The model's hyperparameter details are provided in Supplementary Table 2.

Computing

We used Python as our programming language and PyTorch, an open-source machine learning framework, for segmentation AI modeling. The computational resources comprised an Nvidia GeForce RTX 3060 GPU with 12 GB of VRAM and an AMD Ryzen 5 5600X 6-core processor @ 3.7 GHz with 32 GB of RAM.
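The described architecture can be instantiated in a few lines. The sketch below uses the open-source segmentation_models_pytorch package; the paper does not name the implementation used, so both the library and the three-channel output (background, BD, AW) are assumptions.

```python
# Minimal sketch of a DeepLabv3+ model with an ImageNet-pretrained ResNet50
# encoder, via segmentation_models_pytorch (library choice is an assumption).
import segmentation_models_pytorch as smp

model = smp.DeepLabV3Plus(
    encoder_name="resnet50",     # ResNet50 encoder
    encoder_weights="imagenet",  # pre-trained on ImageNet
    in_channels=3,               # RGB input frames
    classes=3,                   # background, BD, AW (assumed label layout)
)
```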
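An augmentation and preprocessing pipeline matching the transformations listed above might look as follows. The albumentations library and all parameter values are assumptions chosen only for illustration; in particular, the normalization statistics shown are the common ImageNet values, whereas the paper normalizes by the per-channel mean and standard deviation of its own data.

```python
# Illustrative pipeline: geometric and color transforms, Gaussian noise,
# patch-based zero masking, per-channel normalization, and 256x256 resizing.
import albumentations as A

train_transform = A.Compose([
    A.Resize(256, 256),
    A.HorizontalFlip(p=0.5),                         # geometric: flip
    A.Rotate(limit=30, p=0.5),                       # geometric: rotation
    A.ColorJitter(brightness=0.2, contrast=0.2,
                  saturation=0.2, hue=0.1, p=0.5),   # color transforms
    A.GaussNoise(p=0.3),                             # Gaussian noise
    A.CoarseDropout(max_holes=4, max_height=32,
                    max_width=32, fill_value=0,
                    p=0.3),                          # patch-based zero masking
    A.Normalize(mean=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225)),          # assumed channel statistics
])
```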
Evaluation metrics

The model's performance was assessed using the Dice Similarity Coefficient (DSC) between the manual segmentation and the prediction of the deep learning model; the DSC is equivalent to the harmonic mean of precision and recall. This metric quantifies the extent to which the model's predicted region overlaps with the ground truth image (Fig. 2d). The DSC ranges from 0 to 1, with higher values indicating a closer match between the predicted and ground truth images. In this study, the average DSC was calculated for the two classes (BD and AW). The DSC is defined as follows:

DSC = 2 × TP / (2 × TP + FP + FN), precision = TP / (TP + FP), and recall = TP / (TP + FN),

where TP (true positive) denotes pixels for which both the predicted and ground truth values are positive, FP (false positive) denotes pixels for which the predicted value is positive but the ground truth value is negative, and FN (false negative) denotes pixels for which the predicted value is negative but the ground truth value is positive.
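As a direct translation of this definition, the following minimal NumPy sketch computes the per-class DSC from a pair of binary masks; the convention of returning 1.0 when both masks are empty is an assumption.

```python
# DSC = 2*TP / (2*TP + FP + FN) for boolean masks of equal shape.
import numpy as np

def dice_similarity(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice Similarity Coefficient between a predicted and a ground truth mask."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()    # predicted positive, truly positive
    fp = np.logical_and(pred, ~truth).sum()   # predicted positive, truly negative
    fn = np.logical_and(~pred, truth).sum()   # predicted negative, truly positive
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0   # both masks empty -> perfect match
```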
