Automated Association for Osteosynthesis Foundation and Orthopedic Trauma Association classification of pelvic fractures on pelvic radiographs using deep learning

Proposed deep learning models

In this study, we developed and evaluated two deep learning models for pelvic fractures. The first model segments the pelvic ring from the collected image data, and the second classifies fracture patterns using the AO/OTA classification system (Supplementary Fig. S1). Fivefold cross-validation was employed to assess the generalization performance of both the segmentation and classification models, and the results were subsequently compared and analyzed.

To segment the pelvic ring, we used the Attention U-Net architecture, which focuses on the regions most critical for accurately delineating complex anatomical structures. For classification, the Inception-ResNet-V2 architecture was chosen to efficiently capture multi-scale features while simplifying training through residual connections. This approach, tailored to the intricate structure of the pelvis, outperformed conventional models in our experiments. The overall methodology and procedures are illustrated in Fig. 1.

Fig. 1 Overall procedures of the proposed method. (a) Data collection. (b) ROI labeling for the fracture sites (left) and the boundary of the pelvic ring (right). (c) Data preprocessing using histogram equalization. (d) Training of the segmentation and classification models. (e) Performance evaluation of the trained models. ROI, region of interest.

Research environment

The experiments in this study were run on a system consisting of an NVIDIA RTX A5000 (NVIDIA, Santa Clara, CA, USA) graphics processing unit, an Intel Xeon Silver 4216 (Intel, Santa Clara, CA, USA) CPU, and 32 GB of RAM, under the Ubuntu 20.04.6 operating system. NVIDIA driver 525.147.05, Compute Unified Device Architecture (CUDA) 11.2, and TensorFlow 2.6.0 were used.

Dataset acquisition

This study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Gachon University Gil Medical Center (GAIRB2022-153). Owing to the retrospective nature of the study, the Institutional Review Board of Gachon University Gil Medical Center waived the requirement for informed consent.

Antero-posterior (AP) pelvic radiographs were collected from 773 adults (≥ 18 years) diagnosed with pelvic fractures and 167 adults without pelvic fractures who visited Gachon University Gil Hospital between January 2015 and December 2020.

The location of the pelvic fractures was determined from (1) the radiologist's readings of the AP pelvic radiographs, (2) the results of the pelvic CT scan, and (3) the orthopedic surgeon's opinion described in the medical records. A trauma surgeon with > 10 years of experience identified the fracture sites on the AP pelvic radiographs. The identified fracture areas were delineated by drawing a square-shaped region of interest (ROI) using ImageJ version 1.53t (National Institutes of Health, Bethesda, MD, USA) and by marking an ROI along the boundary of the pelvic ring (Supplementary Fig. S2).

The AO/OTA classification system is divided into the major categories A, B, and C and their subcategories A1–3, B1–3, and C1–3 [11]. The AP pelvic radiographs of the 773 patients with pelvic fractures were classified as type A, B, or C according to the AO/OTA classification system by the orthopedic surgeon. The trauma surgeon made the final classification by reviewing the pelvic AP radiograph together with the pelvic CT findings.

Data preprocessing

Some of the pelvic AP radiographs were too bright or too dark to accurately detect the location of the fracture, so a preprocessing step, histogram equalization, was applied to these images. Histogram equalization is an image-processing method that improves contrast by redistributing the brightness values of an image; by remapping these values, lost contrast is recovered and a clearer image is obtained [21,22]. Supplementary Fig. S3 shows pelvic AP radiographs with low contrast converted to high contrast using histogram equalization.
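As a minimal sketch of this preprocessing step, the snippet below applies histogram equalization to a grayscale radiograph using OpenCV; the file names are placeholders, and the study does not state which implementation was used.

```python
import cv2

# Load a pelvic AP radiograph as an 8-bit grayscale image
# (the file name is a placeholder, not from the study).
image = cv2.imread("pelvic_ap.png", cv2.IMREAD_GRAYSCALE)

# Histogram equalization remaps intensities so the brightness histogram
# is spread more uniformly, recovering contrast in images that are
# too bright or too dark.
equalized = cv2.equalizeHist(image)

cv2.imwrite("pelvic_ap_equalized.png", equalized)
```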
Segmentation in pelvic AP radiographs using Attention U-Net

The pelvic ring is formed by two innominate bones that articulate posteriorly with the sacrum and anteriorly at the pubic symphysis. Each innominate bone comprises three fused bones: the ilium, ischium, and pubis. The sacrum articulates superiorly with the fifth lumbar vertebra, and the acetabulum on each side of the pelvis articulates with the femoral head. Because the pelvic AP radiograph also shows the lower lumbar spine and proximal femur, we performed a segmentation step so that fracture classification would be confined to the pelvic ring. We constructed an artificial intelligence (AI) model based on the Attention U-Net architecture to generate a segmentation mask for the pelvic ring region in pelvic AP radiographs (Fig. 2) [19].

Fig. 2 Attention U-Net architecture.

The U-Net architecture applies repeated convolution operations to downsample the input and extract global image features; up-sampling then progressively reconstructs these features into an output image of the same size as the original. The feature maps generated during downsampling are combined with the corresponding up-sampling layers to prevent information loss, thereby enhancing segmentation accuracy. The Attention U-Net incorporates an attention mechanism at this combination stage: the transmitted feature map is compared with the output of the previous stage, and weights are assigned to emphasize the more critical areas, which further improves segmentation accuracy.

For model training, the Dice coefficient loss was used as the error function, and an Adam optimizer with a learning rate of 0.001 was employed. The batch size was set to four, and the model was trained for 100 epochs. ReduceLROnPlateau was used to dynamically adjust the learning rate, and EarlyStopping was employed to prevent overfitting. Pelvic AP radiographs served as the input for model training, and an ROI mask specifying the pelvic ring region was used as the label image. Upon completion of training, the model automatically generated pelvic ring masks from pelvic AP radiographs.
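The following TensorFlow/Keras sketch illustrates this training setup under stated assumptions: the one-level toy network stands in for the full Attention U-Net (whose depth and filter counts are not reproduced here), the Dice-loss smoothing term and the callback patience values are our choices, and the dummy arrays replace the actual radiographs and ROI masks.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

def dice_coefficient_loss(y_true, y_pred, smooth=1.0):
    # Dice loss = 1 - DSC; `smooth` avoids division by zero (assumed value).
    y_true_f = tf.reshape(y_true, [-1])
    y_pred_f = tf.reshape(y_pred, [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    dsc = (2.0 * intersection + smooth) / (
        tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + smooth)
    return 1.0 - dsc

def attention_gate(skip, gating, filters):
    # Compare the skip-connection features with the coarser gating signal
    # and produce per-pixel weights that emphasize the more critical areas.
    theta = layers.Conv2D(filters, 1)(skip)
    phi = layers.UpSampling2D(2)(layers.Conv2D(filters, 1)(gating))
    attn = layers.Conv2D(1, 1, activation="sigmoid")(
        layers.Activation("relu")(layers.add([theta, phi])))
    return layers.multiply([skip, attn])

def build_attention_unet(input_shape=(512, 512, 1)):
    # One-level toy Attention U-Net, for illustration only.
    inputs = layers.Input(input_shape)
    e1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    p1 = layers.MaxPooling2D(2)(e1)
    b = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
    a1 = attention_gate(e1, b, 16)          # weighted skip connection
    u1 = layers.concatenate([layers.UpSampling2D(2)(b), a1])
    d1 = layers.Conv2D(16, 3, padding="same", activation="relu")(u1)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(d1)  # pelvic ring mask
    return Model(inputs, outputs)

model = build_attention_unet()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss=dice_coefficient_loss)

callbacks = [
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),  # assumed values
    EarlyStopping(monitor="val_loss", patience=10),                 # assumed patience
]

# Dummy arrays stand in for the radiographs and pelvic ring ROI masks.
x_train = np.zeros((8, 512, 512, 1), dtype="float32")
y_train = np.zeros((8, 512, 512, 1), dtype="float32")
model.fit(x_train, y_train, batch_size=4, epochs=100,
          validation_split=0.25, callbacks=callbacks)
```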
AO/OTA classification using Inception-ResNet-V2

In this study, we developed an AO/OTA classification model that operates on the pelvic ring region extracted by the segmentation model. The model is based on the Inception-ResNet-V2 architecture (Fig. 3) [20] and uses segmented images of the pelvic ring area from pelvic AP radiographs as input data (Supplementary Fig. S1). Inception-ResNet-V2 combines the Inception network and ResNet, integrating 1 × 1 convolution layers to reduce computational complexity within the Inception modules and simplifying training through ResNet's shortcut connections [20].

Fig. 3 Inception-ResNet-V2 architecture.

Categorical cross-entropy was employed as the loss function, and an Adam optimizer with a learning rate of 0.00004 was used to minimize it. The batch size was set to 4, and the input size was 512 × 512. ReduceLROnPlateau was used to dynamically adjust the learning rate, and multi-processing was utilized. Training was scheduled for 300 epochs but was stopped early to prevent overfitting: training was halted if the loss did not decrease for more than 20 epochs.
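Below is a minimal Keras sketch of this configuration, assuming the stock InceptionResNetV2 from tf.keras.applications with a simple four-class head (normal and types A, B, and C); the head design, random-weight initialization, and the use of validation loss as the early-stopping criterion are our assumptions, and dummy arrays stand in for the segmented images.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import InceptionResNetV2
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

# Backbone on 512 x 512 inputs; the classification head below is an
# assumption, as the paper does not describe the top layers in detail.
backbone = InceptionResNetV2(include_top=False, weights=None,
                             input_shape=(512, 512, 3))
x = layers.GlobalAveragePooling2D()(backbone.output)
outputs = layers.Dense(4, activation="softmax")(x)  # normal, A, B, C
model = Model(backbone.input, outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=4e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])

callbacks = [
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),  # assumed values
    # Stop training if the loss does not decrease for more than 20 epochs.
    EarlyStopping(monitor="val_loss", patience=20),
]

# Dummy arrays stand in for the segmented pelvic ring images and labels.
x_train = np.zeros((8, 512, 512, 3), dtype="float32")
y_train = tf.keras.utils.to_categorical([0, 1, 2, 3, 0, 1, 2, 3], 4)
model.fit(x_train, y_train, batch_size=4, epochs=300,
          validation_split=0.25, callbacks=callbacks)
```

The reported multi-processing most likely refers to Keras's `use_multiprocessing` option, which takes effect when data are fed through a generator or `Sequence` rather than in-memory arrays as in this sketch.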
Performance evaluation and statistical analysis

To assess the performance of the pelvic ring segmentation model, we divided the dataset into five folds and conducted fivefold cross-validation. Performance metrics including sensitivity, specificity, accuracy, and the Dice similarity coefficient (DSC) were measured [23]. Similarly, the AO/OTA classification model using the segmented images underwent fivefold cross-validation, with precision, sensitivity, accuracy, and the F1 score as performance metrics. Receiver operating characteristic–area under the curve (ROC–AUC) values were analyzed to comprehensively evaluate performance across the classes (normal and type A, B, and C fractures) [24,25]. The DSC and the F1 score are defined by the same equation; the DSC is widely used in image segmentation, whereas the F1 score is common in classification tasks. Because this is a multi-class classification task, we report macro averages: the macro average calculates each performance indicator per class and then takes the arithmetic mean of these indicators.

Precision, the proportion of predicted positive instances that are truly positive, was calculated as follows (Eq. 1):

$$\text{Precision} = \frac{TP}{TP + FP} \quad (1)$$
Specificity, the proportion of negative instances correctly identified as negative, was calculated as follows (Eq. 2):

$$\text{Specificity} = \frac{TN}{TN + FP} \quad (2)$$
Sensitivity, the proportion of positive instances correctly identified as positive, was calculated as follows (Eq. 3):

$$\text{Sensitivity} = \frac{TP}{TP + FN} \quad (3)$$
The DSC, which measures the overlap between the predicted and ground-truth regions, was calculated as follows (Eq. 4):

$$\text{DSC} = \frac{2TP}{2TP + FP + FN} \quad (4)$$

Accuracy, the proportion of correct predictions among all predictions, was calculated as follows (Eq. 5):

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (5)$$

The F1 score, the harmonic mean of precision and sensitivity (recall), was calculated as follows (Eq. 6):

$$\text{F1 score} = \frac{2 \times \text{Sensitivity} \times \text{Precision}}{\text{Sensitivity} + \text{Precision}} = \frac{2TP}{2TP + FP + FN} \quad (6)$$
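As a concrete check of Eqs. (1)–(6), the short Python function below computes all six metrics from raw confusion-matrix counts; it is a didactic sketch with arbitrary example counts, not code or data from the study.

```python
def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the metrics of Eqs. (1)-(6) from confusion-matrix counts."""
    precision = tp / (tp + fp)                        # Eq. (1)
    specificity = tn / (tn + fp)                      # Eq. (2)
    sensitivity = tp / (tp + fn)                      # Eq. (3)
    dsc = 2 * tp / (2 * tp + fp + fn)                 # Eq. (4); same formula as F1
    accuracy = (tp + tn) / (tp + tn + fp + fn)        # Eq. (5)
    f1 = 2 * sensitivity * precision / (sensitivity + precision)  # Eq. (6)
    return {"precision": precision, "specificity": specificity,
            "sensitivity": sensitivity, "dsc": dsc,
            "accuracy": accuracy, "f1": f1}

# Arbitrary example: 80 TP, 90 TN, 10 FP, 20 FN.
# F1 = 2*80 / (2*80 + 10 + 20) = 160/190 ≈ 0.842, identical to the DSC.
print(confusion_metrics(80, 90, 10, 20))
```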
The macro average, the arithmetic mean of a performance indicator over the N classes in a multi-class classification, is defined as follows (Eq. 7):

$$\text{Macro average} = \frac{1}{N}\sum_{i=1}^{N} \text{Metric}_i \quad (7)$$

Statistical analysis was conducted using MedCalc version 19.6.1 (MedCalc Software Ltd, Ostend, Belgium), SPSS version 23.0 (IBM Corp., Armonk, NY, USA), and the Python machine learning frameworks scikit-learn (1.0.2) and Keras (2.6). Continuous variables are expressed as mean ± standard deviation, and categorical variables as numbers (%). Statistical significance was set at P < 0.05.
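A minimal sketch of how the macro-averaged metrics and the multi-class ROC–AUC can be computed with the reported scikit-learn version is shown below; the toy labels, predicted probabilities, and the one-vs-rest AUC setting are our assumptions rather than details given in the paper.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Toy data: integer labels for the four classes (0 = normal, 1-3 = types A-C)
# and per-class predicted probabilities of shape (n_samples, 4).
y_true = np.array([0, 1, 2, 3, 1, 0, 2, 3])
y_prob = np.random.default_rng(0).dirichlet(np.ones(4), size=8)
y_pred = y_prob.argmax(axis=1)

# Macro averaging computes each metric per class and then takes the
# arithmetic mean across classes, as in Eq. (7).
precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
sensitivity = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
accuracy = accuracy_score(y_true, y_pred)

# Multi-class ROC-AUC; the one-vs-rest scheme here is an assumed setting.
auc = roc_auc_score(y_true, y_prob, multi_class="ovr", average="macro")
print(precision, sensitivity, f1, accuracy, auc)
```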
