Noninvasive, label-free image approaches to predict multimodal molecular markers in pluripotency assessment

Distinct conditions to culture iPSCs

To observe variations in the pluripotency status of iPSCs under distinct culture conditions and different molecular measurements, we defined an experimental arrangement (Fig. 1A). Cell viability analysis showed a nonsignificant decrease in cell viability in Cond 2 and Cond 4 cells (Fig. 1B). Microscopic imaging was performed at 20 time points on each of Days 1 and 4 after cell seeding. Cells from all replicates in each condition were harvested on Day 5, distributed, and used for molecular assay-based pluripotency assessment using flow cytometry (FCM), immunocytochemistry, qPCR, and RNA-Seq. Three technical replicates were tested for each condition.

Figure 1. Experimental arrangement of distinct culture conditions. (A) For all four conditions, cell imaging was performed on Days 1–2 and Days 4–5 after cell seeding. White and black triangles indicate cell media change to pluripotency media and condition-specific media, respectively. Cells were harvested on Day 5, followed by molecular assessment. (B) The viability of cells harvested on Day 5 was measured using ViCELL XR Cell Viability Analyzer and FCM.

Image data acquisition of iPSCs

For cell imaging, we employed time-lapse bright-field imaging and created the final cell image (Fig. 2A) using the commercial software CellPathfinder. The resulting image dataset for each sample comprised 200 FOVs. Representative FOV images of each sample are shown in Fig. 2B. Human experts in iPSC maintenance could not detect morphological differences among iPSCs in Cond 1, Cond 2, and Cond 4 by visual inspection. In contrast, in Cond 3, morphological transformation of the cells was visible: the outer contour of each cell was clearer, the cell shape was flattened, and confluence was higher.

Figure 2. Image data acquisition pipeline. (A) Plate arrangement for four conditions (left). There are five 5 × 8 square regions for 200 fields of view (FOVs) in a single well (middle). 
In each field, five planes were acquired at 5-μm intervals along the Z-axis, which were overlaid into a single image for use in image analysis (right). (B) Representative FOVs of the replicates in each of the four conditions at the same coordinate. We selected the image from the 88th FOV. A scale bar of 200 μm is shown at the bottom right. The original images are available as a Supplementary File.

Multimodal molecular data acquisition of iPSCs

In FCM analysis, we employed SSEA-4 and Tra-1-60 as cell surface markers for pluripotency enrichment. The FCM-derived pluripotency ratio was defined as the percentage of double-positive cells per total live cells based on manual- or auto-gating methods (described in the Methods section) (Fig. 3A, Supplementary Fig. 1). The pluripotency ratio remained relatively high with both manual- and auto-gating, and the mean values of all conditions were > 90% (Fig. 3B). The mean pluripotency ratio of the three replicates in Cond 2 was lower than that in the control condition with both manual- and auto-gating, but the difference was not significant because of large intra-condition variance. With manual-gating, the decrease in the pluripotency ratio in Cond 3 cells was marginally significant (p < 0.1, Welch’s t-test), whereas no decrease in Cond 3 cells was observed with auto-gating. Another automated gating function, tailgate, which is suitable for detecting the minimum cutpoint when there is only one major peak, was tested to check whether its results agreed with those of mindensity2-based automated gating (Supplementary Fig. 2). Although the decrease in positive cells in Cond 2 and Cond 4 was more pronounced in the tailgate-based pipeline, the positive cell profiles in the two automated pipelines (mindensity2 and tailgate) were well correlated (r = 0.95), and the impact on the downstream comparison with model predictions is expected to be minimal.

Figure 3. Pluripotency assessment by molecular measurements using different molecular assays. 
(A) Workflows of FCM and immunochemistry data processing. Each sample was assessed using manually and computationally determined thresholds to split the data points into pluripotent or nonpluripotent. (B) Pluripotency ratio assessed in SSEA-4 and Tra-1-60 double-positive cells by FCM analysis. (C) Pluripotency ratio assessed with NANOG- and OCT4-positive cells by immunochemistry. (D,E) Gene expression levels of NANOG and POU5F1 assessed using qPCR and RNA-Seq. (F) Volcano plots of RNA-Seq data (TPM) between Cond 1 and other conditions, where X- and Y-axes indicate Log2-transformed fold change and Log10-transformed adjusted p-values, respectively. Genes with expression (TPM) < 1 in any of the 12 samples were excluded before plotting. (G) PCA plot of RNA-Seq data, where TPM values were used as input. (H) Top five enriched pathways for PC1 and PC2 loading genes analyzed by GSEA against GO biological processes gene sets. In panels (B–E), single asterisks indicate p-value < 0.1 and double asterisks indicate p-value < 0.05 by Welch’s t-test, comparing the means of three replicates of the control (Cond 1) with those of each other condition.

Immunostaining was performed for proteins immunoprecipitated using antibodies against OCT4 and NANOG. Based on the density plot of the fluorescence intensity of each protein per cell, both manual and automated thresholding were applied to reduce arbitrariness (Fig. 3A, Supplementary Fig. 3). For automated binarization, we employed the BASC algorithm. For each sample, the number of cells with an intensity above each threshold divided by the total cell number was calculated as the immunochemistry-derived pluripotency ratio for each marker protein (Fig. 3C). The NANOG-positive cell ratio in Cond 1 cells with the BASC threshold was lower than that with the manual threshold. At the BASC threshold, Cond 2 and Cond 3 showed significantly decreased positive cell ratios. The depletion in the ratio was clearer in Cond 3. 
In contrast, with the manual threshold, the decrease was moderately significant in Cond 2 only (adjusted p-value < 0.1), and the mean profile was similar to that of FCM-assessed pluripotency with manual-gating (Pearson’s correlation coefficient of 0.93). The OCT4-positive cell ratio in Cond 1 with the BASC threshold was lower than that with the manual threshold, whereas the overall positive cell ratio was higher for OCT4 than for NANOG. Consistent with this observation, NANOG exhibited a high degree of heterogeneity with a broad distribution of expression values in maintained pluripotent cells, whereas OCT4 exhibited more uniform expression [23]. The BASC threshold resulted in marginally significant and nonsignificant decreases in the mean positive cell ratios for Cond 2 and Cond 3, respectively. In contrast, the manual threshold showed a significant decrease and a significant increase in OCT4-positive cell ratios in Cond 2 and Cond 3, respectively.

To further assess the differences in intracellular pluripotency markers across conditions, we performed qPCR for POU5F1 and NANOG, as well as RNA-Seq. In qPCR analysis, Cond 2 showed significant depletion of NANOG and POU5F1 expression compared to the control condition (left and right panels of Fig. 3D, respectively). In contrast, the decrease in NANOG expression was more intense in Cond 3 than in Cond 2, and POU5F1 expression in Cond 3 was unchanged compared to that in Cond 1. Consistent results were obtained with the RNA-Seq analysis of NANOG and POU5F1 expression (Fig. 3E). Moreover, the NANOG expression pattern was highly correlated with the FCM-based pluripotency ratio with manual-gating and the immunostaining-derived NANOG-positive cell ratio with BASC thresholding. The POU5F1 expression pattern was correlated with the FCM-based pluripotency ratio with auto-gating and the immunostaining-derived OCT4-positive cell ratio with manual thresholding. 
Interestingly, a nonsignificant but considerable decrease and large variation among samples were observed in NANOG expression by qPCR. The results of the molecular assessment of pluripotency markers are listed in Supplementary Table 4.

The differential-expression volcano plots of RNA-Seq data indicated that many genes were significantly up- or downregulated in Cond 2 and Cond 3, whereas few changed in Cond 4, compared with Cond 1 (Fig. 3F). Interestingly, many pathways related to cell morphology, cell number, cytoskeleton, cell size, and cell adhesion were enriched as significantly up- and downregulated pathways in Cond 2 and Cond 3, respectively (Supplementary Table 5). The genes involved in these pathways are mapped onto the volcano plots (Fig. 3F).

PCA of the RNA-Seq data concurred with the differences in cellular states between cells cultured in Cond 2 and Cond 3, indicating a possible deviation of these cell populations from the pluripotent state (Fig. 3G). The pluripotency-related changes in cell status in Cond 2 and Cond 3 were described by variations along different PC axes, PC2 and PC1, respectively. The GSEA results of the PC1 gene set indicated that culture in differentiation media in Cond 3 led to changes in cell cycle-related biological processes, whereas the GSEA results of the PC2 gene set indicated that the lowered nutrients in Cond 2 led to strong enrichment in stem cell differentiation (adjusted p-value = 2.87e-6) and in appendage and limb morphogenesis (Fig. 3H).

Thus, Cond 2 is characterized by consistently low expression of Nanog and Oct4 genes and proteins at the single-cell and population levels, which may have been caused or influenced by changes in the expression of genes related to stem cell differentiation. In contrast, Cond 3 decreased Nanog protein and gene expression but did not significantly change Oct4 protein and gene expression at either the individual cell or population levels. 
The depletion of Nanog in Cond 3 was significant even in comparison with Cond 2 in BASC-based protein quantification by immunostaining (p-value = 0.007, adjusted p-value = 0.028 for Cond 2 vs. Cond 3), gene expression by qPCR (p-value = 0.006, adjusted p-value = 0.012), and RNA-Seq (p-value = 0.007, adjusted p-value = 0.010). In contrast, Oct4 expression was higher in Cond 3 than in Cond 2 in immunostaining with manual thresholding (p-value = 0.021, adjusted p-value = 0.039 for Cond 2 vs. Cond 3), qPCR (p-value = 0.002, adjusted p-value = 0.005), and RNA-Seq (p-value = 0.017, adjusted p-value = 0.048).

In contrast to the clear depletion of the intracellular molecular markers in Cond 2, the decrease in the FCM-based pluripotency ratio based on cell surface markers was not significant under this condition, although there was a decreasing trend. Instead, FCM analysis with manual-gating showed a decreased pluripotency ratio in Cond 3.

Image AI model-building

Here, we proposed the use of two image AI frameworks to estimate distinct pluripotency signals from different molecular assays by classifying image units as “pluripotent or not,” without a label for each training image. First, as an unsupervised algorithm, a one-class SVC was trained on cropped cell images or tiled images to detect anomalous images; we expected the cell-based models to detect single-cell-level morphology changes and the tile-based models to detect cell population-level differences, such as cell sparseness and distribution (Fig. 4A). In the one-class SVC model-building pipeline using cropped cell images as training data, all FOV images from all samples of all conditions either underwent brightness tuning followed by cell cropping or underwent cell cropping directly without brightness tuning. 
In the pipeline using tiled images as training data, either the images of four adjacent FOVs were merged into one tile image to accommodate more cells and avoid cell-free training images, or FOV images were used directly as tiled image units. Input images were created with and without brightness tuning of the tiled images. Cell- or tile-based images were resized to 50,400 pixels per image unit and used as input for dimensionality reduction to three dimensions using UMAP. One-class SVC models were trained on the resulting dimensions with different kernels (linear, polynomial, and radial basis function) and various values of the ν parameter.

Figure 4. Modeling frameworks of image-based pluripotency assessment. (A) Outlier detection-based classification approach using the one-class SVC. (B) ResNet-based semi-supervised approach to train the classifier using an existing supervised pluripotency classification model of mouse ESCs.

Second, we employed a deep neural network architecture and incorporated a prebuilt classifier model reported by Waisman et al. [22] as a guide model to label images in the initial round of semi-supervised model training iterations (Fig. 4B). The guide model was built to classify undifferentiated and early differentiating mESCs at the early onset of differentiation stimuli by supervised training of the ResNet-50 architecture, using transmitted light microscopy images of mESCs maintained in LIF + serum and of those induced for early differentiation of mesodermal cells. Importantly, the authors confirmed that the resulting classification model has remarkable predictability for human iPSCs in classifying images of undifferentiated and differentiated culture conditions. 
In this study, we retrained the model with the ResNet-50 architecture in our own environment using the original training image dataset provided by the authors, which contains 2,134 training images and 400 validation images from undifferentiated and differentiated mESCs in culture. In our semi-supervised pipeline, FOV images (2528 × 2136) were first resized to 1264 × 1068, and six images (640 × 480) were cropped from each resized image to match the image size used to train the guide model. The merged tile images were then input into the iterative semi-supervised ResNet model, with or without brightness tuning. In the first iteration, all cropped images were labeled as undifferentiated or differentiated using the guide model. Among the pseudo-labeled images, the top 300 images with the highest class probability in the two groups were selected as the labeled dataset. The labeled dataset was used to update the ResNet-50 model and create a new pseudo-labeled dataset. In the following iterations (iterations 2–10), the top 300 pseudo-labeled images from the latest updated ResNet-50 classifier were added to the labeled dataset in each iteration, and the dataset was used to update the ResNet-50 model in the following iteration. The unlabeled dataset contained either images from this study (14,400 images) or images from this study and the original training data of the guide model (1600 images).

The predicted labels of the input image units of each sample were aggregated into the “pluripotency ratio” of the sample, defined as the percentage of the majority class in one-class SVC models and of the undifferentiated class in semi-supervised models. 
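
The pseudo-label selection step and the aggregation into a per-sample pluripotency ratio can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names are hypothetical, the "top 300" selection is assumed to be applied per class, and the "majority class" of the one-class SVC is assumed to be the inlier class, encoded as +1 (the convention used by scikit-learn's OneClassSVM.predict).

```python
def select_top_confident(probs, k=300):
    """Pick the k most confident images for each pseudo-labeled class.

    probs: list of P(undifferentiated) per image from the current classifier.
    Assumption: the top-k rule is applied separately to each class.
    """
    undiff = [(p, i) for i, p in enumerate(probs) if p >= 0.5]
    diff = [(1.0 - p, i) for i, p in enumerate(probs) if p < 0.5]

    def pick(scored):
        # Sort by confidence (descending) and keep the image indices.
        return [i for _, i in sorted(scored, reverse=True)[:k]]

    return {"undifferentiated": pick(undiff), "differentiated": pick(diff)}


def pluripotency_ratio(labels, framework):
    """Percentage of image units of one sample counted as pluripotent."""
    if not labels:
        raise ValueError("no image units for this sample")
    if framework == "one_class_svc":
        # Assumption: inliers (+1) are the majority, i.e. pluripotent, class.
        count = sum(1 for label in labels if label == 1)
    elif framework == "semi_supervised":
        count = sum(1 for label in labels if label == "undifferentiated")
    else:
        raise ValueError(f"unknown framework: {framework}")
    return 100.0 * count / len(labels)
```

For example, a sample whose four image units are predicted as [1, 1, -1, 1] by a one-class SVC would receive a pluripotency ratio of 75%.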
Unless otherwise noted, model predictions were made on FOV images at the last time point of image acquisition (i.e., time point 20 of Day 4), which was the closest time point to the subsequent molecular experiments of pluripotency assessment.

Model prediction, selection, and comparison with experimental molecular measurements

The prediction results of the models with different frameworks, preprocessing steps, and parameters were compared with the experimental molecular measurements. To validate the consistency of the prediction ability across experimental batches, we performed a cell culture assay with six replicates under the control pluripotency condition (Cond 1), followed by FCM analysis (Fig. 5A). Results of manual- and auto-gating-based FCM analysis, shown as black and gray bars, respectively, presented distinct patterns in FCM-derived pluripotency ratios (Supplementary Fig. 4). Notably, replicates 5 and 6 showed lower pluripotency ratios with auto-gating. Intersample changes were moderate in FCM data with manual-gating, and no visible differences were detected by the experts’ visual inspection for any of the six replicates.

Figure 5. Model selection and prediction of pluripotency status. (A) FCM analysis of six control samples treated with the same procedure as Cond 1. Black and gray bars indicate FCM-derived pluripotency ratios (i.e., Tra-1-60 and SSEA-4 double-positive ratios) by manual- and auto-gating, respectively. (B) Model variations were mapped in the UMAP dimension using the model prediction results of pluripotency ratios of 12 samples (Cond 1–4). Model type (cell-based, tile-based one-class SVC, or semi-supervised framework) is indicated by the shape of the data points. The red frame around the data points indicates that the predictions of the highlighted models satisfy the selection criteria. 
(C) The model prediction results of pluripotency ratios for Cond 1–4 and the six validation samples of control pluripotency conditions are represented as a heatmap (blue to red indicate low to high predicted pluripotency ratio). The color-coded panel beside each model indicates the clusters seen in Fig. 5B, and the cluster numbers correspond in both panels. The best-fit models selected by correlation with molecular pluripotency markers (Models I–III) are indicated. (D–E) Predicted pluripotency ratio of the three selected models for the four distinct conditions (D) and the six validation samples for control pluripotency conditions (E). (F) Time-series prediction of pluripotency ratio on the images of the first and the last time points of Day 1 and time points 1, 5, 10, and 20 (last) of Day 4.

The model variations predicting pluripotency ratios for 12 samples under four distinct conditions with three replicates were mapped to the UMAP representation in Fig. 5B and are shown in the heatmap in Fig. 5C. Clear model clusters (clusters 1–7) emerged, depending mostly on the model framework. Models thought to reflect maintenance of pluripotency were highlighted with a red frame in the UMAP dimensions based on the following criteria: (1) the average predicted pluripotency ratio in Cond 1 and in the six validation control pluripotency samples was > 60% in each dataset and (2) the sample-based variance of the predicted pluripotency ratio among all samples in Cond 1–4 was > 50. Based on the rationale that a model with sufficient predictive power for practical use should be robust to small differences in parameters, we identified three clusters (clusters 1, 6, and 7) in which the models meeting these criteria spanned various parameter settings. The models that met the criteria within these clusters corresponded to cell- and tile-based one-class SVC (clusters 7 and 1, respectively) or the semi-supervised modeling framework (Cluster 6). 
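
The two selection criteria above can be expressed as a simple filter over each model's predictions. The sketch below is illustrative only: the function name and data layout are hypothetical, and the text does not state whether the variance is the sample or the population variance, so the sample variance of the ratios (in percent) is assumed here.

```python
from statistics import mean, variance


def meets_selection_criteria(cond_ratios, validation_ratios,
                             min_mean=60.0, min_variance=50.0):
    """Check whether one model satisfies criteria (1) and (2).

    cond_ratios: dict mapping "Cond 1".."Cond 4" to the predicted
        pluripotency ratios (%) of the replicates in that condition.
    validation_ratios: predicted ratios (%) of the validation controls.
    """
    # Criterion (1): pluripotency is maintained in both control datasets.
    maintained = (mean(cond_ratios["Cond 1"]) > min_mean
                  and mean(validation_ratios) > min_mean)
    # Criterion (2): predictions vary enough across all Cond 1-4 samples.
    # Assumption: sample variance; the source does not specify the estimator.
    all_samples = [r for ratios in cond_ratios.values() for r in ratios]
    varied = variance(all_samples) > min_variance
    return maintained and varied
```

A model that predicts nearly the same ratio for every sample fails criterion (2) even if its control predictions are high, which is why both conditions are checked together.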
The best-fit model for the molecular assessment of pluripotency was selected from each of the selected clusters based on Pearson’s correlation between the predicted pluripotency ratio and immunostaining results (Supplementary Fig. 5). Notably, the correlations were calculated on the average values of each of the four conditions.

The predictions of the three resulting models for the four distinct conditions are shown in Fig. 5D, and those for the six validation samples are shown in Fig. 5E. Model I, with a cell-based one-class SVC framework, selected from Cluster 7, had the highest correlation with Oct4 with manual thresholding (r = 0.998), which was also highly correlated with Oct4 gene measurements using qPCR and RNA-Seq. Model I also displayed a high correlation with surface marker protein measurements by FCM with auto-gating (r = 0.912). Model II, from Cluster 1 and with the tile-based one-class SVC framework, had the highest correlation with Nanog measured by immunostaining with automated thresholding (r = 0.961), which also correlated with Nanog gene expression values. Model III, selected from Cluster 6, which is a collection of models with the semi-supervised framework, correlated highly with the expression of Nanog protein (immunostaining with automated threshold; r = 0.978) and Nanog gene. Models II and III exhibited high correlation with surface marker proteins measured by FCM with manual-gating (r = 0.908 and 0.938, respectively).

In contrast, in the validation control pluripotency samples, where each sample is associated with a matching image, Model I was highly correlated with surface marker protein measurements with manual-gating (r = 0.990, r² = 0.980) and Model II was highly correlated with the auto-gating results of the same data (r = 0.901, r² = 0.811). However, Model III did not predict the FCM data with either gating strategy. 
This may be because the guide model referenced in the semi-supervised framework of Model III was built on images of differentiated and undifferentiated conditions rather than on heterogeneity under controlled pluripotency conditions. There were noteworthy differences between automated and manual thresholds in the quantification of protein intensities in immunostaining and FCM data, which are often debated [24]. For the immunostaining data, BASC-based automated thresholding of Nanog protein intensity was highly correlated with the gene expression profiles measured by RNA-Seq and qPCR. In contrast, the Oct4 protein had a high correlation with its gene expression profile when the intensity threshold was set manually at the valley of the bimodal peak. Although the robustness of quantification was beyond the scope of this study, it is interesting to note that the unsupervised models that best fit Nanog and Oct4 predicted the automatically and manually gated FCM-based pluripotency ratios, respectively.

Time-series prediction of pluripotency

We applied the best models to images obtained at earlier time points to predict the onset of changes in pluripotent states represented in each model (Fig. 5F). For all three selected models, the predicted pluripotency ratio increased after the start of culture and plateaued by the last time point of Day 1. In addition, the predictions for replicates in each condition were relatively consistent, indicating that the models reliably captured the biological consequences reflected in the images. In Model I, which correlated with the Oct4 profile, the decrease in pluripotency ratio in all conditions became clear as late as time point 5 of Day 4, and a steep decline in Cond 2 was observed toward the last time point. Model II, which predicted the Nanog profile, identified a noticeable change in cellular states at the start of Day 4 in Cond 2, where the cells had been stimulated by inactivated media from Day 2. 
Similarly, in Model III, the changes in cellular states in Cond 2 were detectable at the onset of the second image capture. Conversely, for Cond 3, Models II and III detected extremely rapid changes in the cellular state immediately after stimulation by FBS on Day 4. For Cond 3, it is conceivable that Models II and III identified changes similar to those visually observable by humans, such as cell flattening and a rapid increase in confluency over time. For Cond 1 and Cond 4, neither significant differences in protein and gene expression nor significant fluctuations in the predicted cellular states were observed. Notably, in Model II, drastic changes in the predicted pluripotency ratio were observed earlier in Cond 2 and Cond 3; thus, Model II may have the best potential to detect abnormalities in PSCs.

Model prediction robustness

To examine the robustness of the model prediction to the position within the microscopic field captured in the input images, we randomly sampled 10% of all FOVs in each sample 100 times for model input and compared the predicted pluripotency ratio with the prediction using all FOVs as input images (Fig. 6). The mean pluripotency ratio among the 100 random samplings was consistent with the predicted value for all images (Pearson’s correlations and regression coefficients were > 0.99 for all three models). In contrast, the deviations across random inputs were larger in Model II (tile-based one-class SVC model) and significantly smaller in Model I (cell-based one-class SVC model). This may be due to the larger size, and thus the smaller number, of input images in Model II, which takes merged tiles, and the smaller size and larger number of input images in Model I, which takes individual cell images.

Figure 6. Robustness of the pluripotency ratio prediction of the selected models to FOV selection. The X-axis shows the mean and standard deviation of the predicted pluripotency ratios over 100 repeated predictions using 10% randomly selected FOVs. 
The Y-axis shows the predicted pluripotency ratio of each model using all FOVs of each sample.

We also compared the average Oct4 and Nanog intensity for each FOV from the immunostaining data for all samples with the prediction of Model II (the tile-based unsupervised model), which uses tiles in the FOV as input (Supplementary Fig. 6). The results showed a good correlation, especially for Nanog, between the actual averaged intensity values and the predicted distance from the model’s origin, although the model had not learned any molecular information or differentiation/undifferentiation labels.

Finally, the applicability of the models was tested using three different iPS cell lines, 1231A3, ND50018, and ND50019, cultured under control conditions in the same well-plate settings as the validation study in Fig. 5A, and the cells were assessed by qPCR (POU5F1 and NANOG) and flow cytometry (SSEA-4 and Tra-1-60). The cell line 1231A3 was derived at the same institution (Center for iPS Cell Research and Application, Kyoto University, Japan) as 201B7, which was used throughout this study, but its cell type of origin is peripheral blood, unlike the fibroblast-derived 201B7. ND50018 and ND50019 were both derived at the NIH Center for Regenerative Medicine (CRM) in the USA, but their cell types of origin differ: umbilical cord blood and fibroblasts, respectively (Supplementary Table 6). Overall, the expression of POU5F1 and NANOG was not synchronized except in ND50019 (correlation coefficients r in 1231A3, ND50018, and ND50019 were -0.71, 0.05, and 0.79, respectively), and the FCM results for automatic and manual gates also had different patterns (correlation coefficients r in 1231A3, ND50018, and ND50019 were -0.54, 0.3, and -0.04, respectively), which makes generalization of the prediction performance more complex (Supplementary Table 7). 
Even under such circumstances, the semi-supervised model (Model III) showed relatively good correlation with Oct4 expression and the FCM results (manual-gating), with correlation coefficients ranging from 0.56 to 0.73 (Oct4) and from 0.47 to 0.78 (FCM). The average double-positive (SSEA-4+/Tra-1-60+) rate for 1231A3 was below 90% in both auto-gated and manual-gated FCM analyses, which was also predicted by Model I (cell-based) and Model III. On the other hand, Model II (tile-based, unsupervised) predicted very low pluripotency ratios for the other cell lines.
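
The correlation coefficients reported throughout this section compare predicted pluripotency ratios with molecular measurements. A minimal sketch of that comparison is given below, assuming Pearson's r computed on paired values such as per-condition means. The function names and the dictionary layout are hypothetical and only serve the illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired sequences of length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def condition_means(values_by_condition):
    """Average replicate values per condition, ordered by condition name."""
    return [mean(v) for _, v in sorted(values_by_condition.items())]


def model_vs_assay(predicted_by_condition, measured_by_condition):
    """Return (r, r squared) between predicted and measured condition means."""
    r = pearson_r(condition_means(predicted_by_condition),
                  condition_means(measured_by_condition))
    return r, r ** 2
```

With only a few paired points per comparison (four conditions, or the replicates of one cell line), such coefficients are sensitive to individual samples, so they are best read alongside the underlying values in the supplementary tables.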
