Exploring the value of multiple preprocessors and classifiers in constructing models for predicting microsatellite instability status in colorectal cancer

Patients and data

The ethics review committee of Nanjing Drum Tower Hospital approved this retrospective study and waived the requirement for informed consent. All procedures involving human participants were performed in accordance with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The data of patients with CRC confirmed by surgery and pathology in our hospital were collected consecutively from January 2020 to October 2022. The inclusion criteria were as follows: (1) abdominal contrast-enhanced computed tomography (CT) examination before surgery, (2) pathologically confirmed CRC, and (3) available MSI status tested by IHC. The exclusion criteria were as follows: (1) an interval between the CT scan and surgery of more than 2 weeks (n = 15), (2) insufficient image quality to distinguish the tumor contour due to motion or metal artifacts (n = 18), and (3) any anti-tumor treatment before the CT scan (n = 32). Figure 1 presents the specific inclusion and exclusion criteria.

Fig. 1 Patient screening and grouping process. MSS microsatellite stability, MSI microsatellite instability, IHC immunohistochemistry.

The collected clinical and pathological indicators included history of hypertension or diabetes, sex, age, tumor location, and clinical TNM stage. Tumor markers, including CEA, CA125, and CA199, were taken from the last laboratory examination before the operation. These results were confirmed by two clinicians.

MSI status assessment

The pathological tissues were stained by IHC using the standard streptavidin–biotin peroxidase process29. Subsequently, MSI status was identified by assessing the IHC staining results of four major mismatch repair (MMR) proteins (MLH1, PMS2, MSH2, and MSH6) in the tissue.
Among the four MMR proteins, any lack of expression was considered MSI, whereas positive expression of all four was considered MSS30.

CT scan

All patients were scanned using the same 160-slice CT scanner (uCT 780, United Imaging Healthcare, Shanghai, China). Each patient received an informed consent form at the time of appointment for the CT scan, covering unified pre-examination preparation. Patients were required to fast for more than 4 h before the examination and to take 250–300 mL of water orally 30 min before scanning. To improve the standardization of examinations, an integrated scanning protocol had been developed specifically for these patients, including a unified scanning sequence package and contrast agent. Omnipaque (350 mg I/mL, GE Healthcare) at a dose of 1.5 mL/kg was administered through the anterior elbow vein using a high-pressure syringe at a rate of 2.5–3.0 mL/s. Each patient underwent plain scanning, followed by three phases of enhanced scanning. Starting from the injection of the contrast agent, the arterial, venous, and delay phase scans were triggered at delays of 40 s, 70 s, and 180 s, respectively. The scanning field extended from the top of the diaphragm to the level of the pubic symphysis. The parameters were as follows: tube current: automatic mAs; tube voltage: 120 kV; pitch: 0.9875:1; rotation time: 0.5 s; matrix: 512 × 512; field of view: 350 × 350 mm. All images were reconstructed with hybrid iterative reconstruction (KARL 3D, United Imaging Healthcare, Shanghai, China) at a 5.0-mm layer thickness and 5.0-mm layer spacing.

Image processing and feature extraction

The venous phase images were selected and imported into the uAI Research Portal software (Shanghai United Imaging Intelligence, Co., Ltd.). Its workflow consisted of four parts: image annotation, feature extraction, feature selection, and model construction and evaluation (Fig. 2).
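The MSI/MSS call described above (any loss of MMR protein expression implies MSI) can be sketched as a small decision rule. This is purely an illustrative sketch; the function name and input format are assumptions, not part of the study's software.

```python
# Hypothetical sketch of the IHC-based MSI/MSS call: the input maps each of the
# four MMR proteins to its expression status (True = positive/retained).
MMR_PROTEINS = ("MLH1", "PMS2", "MSH2", "MSH6")

def msi_status(ihc: dict) -> str:
    """All four proteins retained -> MSS; any loss of expression -> MSI."""
    return "MSS" if all(ihc[p] for p in MMR_PROTEINS) else "MSI"
```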
All tumors were manually delineated by a senior diagnostic radiologist (reader 1, with 11 years of experience), who was blinded to MSI status. The cross-section with the largest tumor area was chosen, including necrotic and bleeding areas, while avoiding blood vessels, perienteric fat, intestinal contents, and gas. These areas were marked as regions of interest (ROIs) (Fig. 3). For patients with multiple tumors, the largest one was chosen for ROI delineation.

Fig. 2 Workflow of MSI status prediction for colorectal cancer patients, including image segmentation and feature extraction, data grouping, feature and model selection, and model building and evaluation.

Fig. 3 The tumor with the largest cross-sectional area was segmented on the venous phase, avoiding the intestinal contents and gas.

Two-dimensional radiomics features were extracted with the widely used radiomics toolbox PyRadiomics31, which contains seven stable feature categories and 14 image filters. Ultimately, 2,259 features were extracted from each ROI. Detailed information on the radiomics features can be found in our previous study26.

Feature selection and model construction

After generating the features, machine-learning methods were used to select appropriate features and predict the MSI status of CRC patients. To avoid sample bias in grouping, a stratified fivefold cross-validation strategy was used to randomly but equally divide all patients into five partitions, ensuring that the same percentage of each class (i.e., MSI/MSS) was preserved in each partition. Finally, five different training and test sets were acquired, and the mean value was taken to obtain a more reliable and accurate evaluation.
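A minimal sketch of the stratified fivefold split, assuming binary labels (1 = MSI, 0 = MSS) and a toy cohort of 100 patients (the counts are illustrative, not the study's); scikit-learn's StratifiedKFold preserves the class ratio in every fold:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy cohort: 80 MSS (0) and 20 MSI (1) patients -- illustrative numbers only.
y = np.array([0] * 80 + [1] * 20)
X = np.zeros((len(y), 5))  # placeholder feature matrix

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # Each test fold keeps the cohort's 4:1 MSS:MSI ratio (16 MSS, 4 MSI here).
    print(fold, np.bincount(y[test_idx]))
```

Each of the five (train, test) pairs is then used independently, so the reported performance is the mean over folds.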
To ensure the robustness and generalizability of each model, the feature selection and prediction process was limited to the training set, and the parameters obtained from the training cohort were applied to the test set.

Before feature selection, we first used inter-/intra-class correlation coefficients (ICCs) to evaluate inter-/intra-delineator reproducibility. In detail, about two months after completion of the image delineation, 30 patients32 were randomly selected, and the above steps were repeated by reader 1 and another diagnostic radiologist (reader 2, with 8 years of experience), i.e., manually delineating the ROIs of the 30 patients and extracting the radiomics features. Features with ICCs less than 0.75 were excluded. Subsequently, the least absolute shrinkage and selection operator (LASSO) algorithm was used to pick the most predictive feature subset within each training set of the fivefold cross-validation. The coefficients of the selected features were then used to calculate each patient's Rad-score. The Rad-score of each sample in the test set was computed from the LASSO coefficients of the corresponding training set and the feature values of the test-set sample itself. The following equation was used to calculate the Rad-score:

$$\text{Rad-score} = \sum_{i=1}^{n} C_i \times X_i + b$$

where \(n\) is the number of selected features, \(C_i\) is the coefficient of the ith feature from the LASSO regression algorithm, \(X_i\) is the value of the ith feature, and \(b\) is the intercept of the LASSO model.

Based on six feature preprocessors (Box-Cox, Yeo-Johnson, Max-Abs, Min-Max, Z-score, and Quantile) and three classifiers [logistic regression, support vector machine (SVM), and random forest], different discriminant models were constructed in the training set using the screened radiomics features. Logistic regression is a well-established and interpretable method, suitable for linear relationship problems33.
SVM is known for its ability to handle complex data patterns and nonlinear relationships, or cases where the decision boundaries are not linearly separable34. Random forest, an ensemble learning method, offers robustness and good performance through the combination of multiple decision trees35. These classifiers have been widely used and have demonstrated effectiveness in previous studies36,37,38, making them suitable choices for our analysis.

In the test stage, the trained models were applied to the test dataset to predict the probability of MSI or MSS status. The model with the highest average area under the curve (AUC) in the test set was chosen as the radiomics model. To predict MSI status, multivariate regression analysis was performed on the clinical characteristics with P values less than 0.1 in the difference analysis to screen out independent clinical factors. The same feature preprocessing algorithm and classifier as in the radiomics model were used to develop the clinical screening model and the combined model. The clinical screening model was composed of the independent clinical factors, whereas the combined model included the independent clinical factors and the Rad-score derived from the LASSO feature selection process. To provide clinicians with a convenient and user-friendly approach for rapidly and accurately estimating the risk of MSI status in individual patients, a nomogram model was developed. It should be noted that all available data were employed for training and estimating the parameters of the nomogram, which allows a more comprehensive understanding of the overall patterns and relationships. Specifically, the clinical characteristics and Rad-score values were obtained directly by concatenating the test sets from the fivefold cross-validation used in the construction of the combined model.
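The preprocessor-by-classifier grid described above can be sketched with scikit-learn pipelines. This is a minimal sketch, not the study's code: Box-Cox and Yeo-Johnson map to `PowerTransformer`, Quantile to `QuantileTransformer`, and the classifier hyperparameters shown are placeholder defaults rather than the grid-searched values.

```python
from sklearn.base import clone
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import (PowerTransformer, MaxAbsScaler, MinMaxScaler,
                                   StandardScaler, QuantileTransformer)
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

preprocessors = {
    "Box-Cox": PowerTransformer(method="box-cox"),   # requires strictly positive features
    "Yeo-Johnson": PowerTransformer(method="yeo-johnson"),
    "Max-Abs": MaxAbsScaler(),
    "Min-Max": MinMaxScaler(),
    "Z-score": StandardScaler(),
    "Quantile": QuantileTransformer(output_distribution="normal"),
}
classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(probability=True),   # probability outputs are needed for the AUC
    "RF": RandomForestClassifier(n_estimators=100),
}

# 6 preprocessors x 3 classifiers = 18 candidate models; the pair with the
# highest mean test-set AUC across the five folds is kept as the radiomics model.
pipelines = {(p, c): make_pipeline(clone(prep), clone(clf))
             for p, prep in preprocessors.items()
             for c, clf in classifiers.items()}
print(len(pipelines))  # 18
```

Wrapping each pair in a pipeline ensures the preprocessor is fitted on the training fold only and then applied, with frozen parameters, to the test fold.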
Additionally, three features were randomly selected from the features screened by LASSO, and the six data transformations were applied to them to compare the feature processing results of the different preprocessors. In model construction, the hyperparameters were tuned on the training set with a grid search to optimize predictive accuracy; detailed information can be found in the Supplementary material.

Statistical analysis

We used the Mann–Whitney U test and the χ2 test to compare the continuous and categorical variables, respectively. All statistical tests were two-sided, and statistical significance was set at P < 0.05. To evaluate and verify the predictive effectiveness of the models, the receiver operating characteristic (ROC) curves of the clinical, radiomics, and combined models were analyzed. We used the DeLong test to statistically compare the AUC values obtained from the different prediction models. The average performance of each model was evaluated across the fivefold cross-validation. The clinical applicability and calibration of the models were compared using decision curve analysis (DCA) and calibration curves. The Brier score (BS) was used for quantitative analysis of each model's performance: BS = 0 indicates that the model performs perfectly, with predicted and actual values identical, while BS > 0.25 implies failure of the model prediction. To address the impact of class imbalance on our calibration curve analysis, the BS value was adjusted based on the class distribution. All statistical tests were executed using IBM SPSS Statistics for Windows, version 26 (IBM Corp., Armonk, N.Y., USA) and R software (version 3.5.2; http://www.Rproject.org). All feature preprocessing and model construction were carried out using the scikit-learn package in Python 3.9.12.
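The Brier score is the mean squared difference between the predicted probabilities and the observed 0/1 outcomes. A minimal sketch with toy numbers; the class-distribution adjustment shown is one common form (scaling against a reference model that always predicts the event prevalence), offered as an assumption rather than the study's exact formula:

```python
import numpy as np
from sklearn.metrics import brier_score_loss

y_true = np.array([0, 0, 0, 1, 1])             # toy labels (0 = MSS, 1 = MSI)
y_prob = np.array([0.1, 0.2, 0.3, 0.8, 0.7])   # predicted P(MSI)

# Brier score: mean of (probability - outcome)^2; 0 is perfect, > 0.25 poor.
bs = brier_score_loss(y_true, y_prob)
print(round(bs, 3))  # 0.054

# Assumed imbalance adjustment: compare against a constant-prevalence model,
# whose Brier score is p * (1 - p); values closer to 1 are better.
p = y_true.mean()
scaled_bs = 1 - bs / (p * (1 - p))
```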
