Investigating quantitative histological characteristics in renal pathology using HistoLens

Human data collection followed a research protocol approved by the Institutional Review Boards (IRBs) at the University of Florida (Gainesville, FL), Johns Hopkins Medical Institutions (Baltimore, MD), and the University of Michigan (Ann Arbor, MI). Animal studies were performed in accordance with protocols approved in advance by the Institutional Animal Care and Use Committee at the National Institutes of Health, Bethesda, MD.

Software overview

HistoLens is a stand-alone desktop application for hand-engineered image feature analysis, quantification, classification, and visualization of digital pathology images (Fig. 1). HistoLens accepts as input digital histology whole-slide images (WSIs) along with WSI micro-compartmental annotations at varying scales (from cell nuclei to larger structures, such as glomeruli, in the context of kidney pathology) in Aperio ImageScope Extensible Markup Language (XML) or JavaScript Object Notation (GeoJSON/JSON) QuPath annotation format. Annotations can be generated manually or with deep learning models implemented in cloud platforms such as Histo-Cloud or FUSION (Functional Unit State Identification for WSIs)16,17. Software applications that support the recording of digital pathology annotations include open-source applications, such as QuPath and the Automated Slide Analysis Platform (ASAP), and commercial applications, including Pathomation (Antwerp, Belgium) and Aperio ImageScope (Leica Biosystems, Wetzlar, Germany), among others18,19,20. If no annotations are found in the same folder as the WSIs, users can instead use the annotation tool included in HistoLens, which allows them to interactively annotate multiple different types of structures. The resulting annotations can then be saved in Aperio ImageScope XML format, GeoJSON format, or the Histomics format employed in Histo-Cloud.

Figure 1. HistoLens workflow.
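To make the GeoJSON annotation input concrete, the following Python sketch reads polygon annotations from a QuPath-style GeoJSON file. The key layout (a FeatureCollection whose features carry a classification name and polygon coordinates) is an assumption based on common QuPath exports, and `load_annotation_polygons` is a hypothetical helper for illustration, not a HistoLens function (HistoLens itself is not written in Python).

```python
import json

def load_annotation_polygons(geojson_path):
    """Collect (class_name, polygon_ring) pairs from a QuPath-style
    GeoJSON annotation file. The key layout assumed here is an
    illustrative guess at a typical export, not the exact schema."""
    with open(geojson_path) as f:
        data = json.load(f)
    polygons = []
    for feature in data.get("features", []):
        geometry = feature.get("geometry", {})
        if geometry.get("type") != "Polygon":
            continue
        # QuPath-style files keep the structure class under
        # properties -> classification -> name.
        name = (feature.get("properties", {})
                       .get("classification", {})
                       .get("name", "unclassified"))
        # First ring of the polygon: [[x1, y1], [x2, y2], ...]
        polygons.append((name, geometry["coordinates"][0]))
    return polygons
```

Each returned ring is a list of [x, y] pixel coordinates that downstream tools can rasterize into structure masks.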
A new experiment in HistoLens is initialized by providing the file paths to a directory of whole-slide images (WSIs), together with their associated annotation files and slide metadata. User-provided file paths and a record of analyses are saved into an experiment file that can be quickly reloaded to pick up where a previous analysis ended.

Computational analysis in HistoLens includes interactive sub-compartment segmentation (extracting biologically relevant sub-compartments in each image; for example, nuclei, eosinophilic regions, and luminal space), color normalization, feature extraction, and feature ranking, all linked to patient-level or tissue-level labels21. Sub-compartment segmentation is performed using an interactive procedure that minimizes the number of user-specified parameters and therefore simplifies the image analysis process. These parameters include colorspace, intensity threshold, minimum object size, and a splitting parameter applied only to overlapping nuclei. Users can also modify sub-compartment segmentation parameters for individual slides if there is variation in staining or imaging conditions within their dataset. If a user is confident that the staining characteristics are consistent across their slides, they may instead bypass the remaining slides and project the current segmentation parameters onto subsequent slides prior to feature extraction. The resulting data (sub-compartmental segmentation parameters and their associated quantitative hand-engineered features) can then be analyzed in HistoLens for structure classification and interactive visualization for biological discovery.

The main window of HistoLens consists of three major panels (Fig. 2). The first panel presents an annotated microanatomic structure pulled from the WSI, together with any hand-engineered feature visualizations, which are displayed as an overlaid heatmap contained within a rectangular region of interest (ROI).
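The segmentation and feature-extraction steps described above can be sketched minimally in Python (HistoLens itself appears to be implemented in MATLAB, per Supplemental Table 3, so this is purely illustrative). The threshold and size values are arbitrary, the green channel is used as a rough proxy for PAS positivity, and both helper names are invented:

```python
import numpy as np
from scipy import ndimage

def segment_subcompartment(rgb, threshold=0.55, min_size=50):
    """Toy threshold-based sub-compartment segmentation. `rgb` is an
    H x W x 3 float array in [0, 1]; the threshold and minimum object
    size stand in for the user parameters HistoLens exposes."""
    # PAS-positive regions appear dark in the green channel, so a
    # simple proxy mask is "green intensity below threshold".
    mask = rgb[..., 1] < threshold
    # Remove connected objects smaller than min_size pixels.
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    return np.isin(labels, 1 + np.flatnonzero(sizes >= min_size))

def mask_features(rgb, mask):
    """Hand-engineered features of the masked region: pixel area and
    per-channel mean/std of RGB values (the same kinds of features
    quantified for PAS+ regions later in this work)."""
    pixels = rgb[mask]
    return {"area_px": int(mask.sum()),
            "rgb_mean": pixels.mean(axis=0),
            "rgb_std": pixels.std(axis=0)}
```

Adjusting the keyword arguments per slide mirrors the per-slide parameter tuning described above for datasets with variable staining.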
The annotated structure in this display can be changed in three ways. Users can (1) select a different annotated image pulled from the WSI, using the accompanying annotation file, by selecting its label in the list below the displayed image, (2) click the “Next Image” or “Previous Image” buttons, or (3) tap the right or left arrow keys on the keyboard.

Figure 2. HistoLens main window with select interactive components highlighted. (A) The main image viewing area. Here the current microanatomic structure, pulled from the whole-slide image (WSI) in accordance with the annotation in the provided annotation text file, is displayed, along with the feature visualization (with adjustable transparency) and the relative feature intensity region of interest (ROI). (B) The toolbox, including feature distribution plot, statistical summaries, annotations, and modeling capabilities. (C) The selected structure’s feature value relative to the whole dataset is indicated in the Feature Distribution Plot. (D) Hand-engineered features are organized into a hierarchy by sub-compartment and feature type. Users can select individual features in the Specific Feature column for visualization in the right panel (B), combine features by sub-compartment or feature type, or add selected features to a custom list (E). (E) Custom list where users can add features across sub-compartments and categories for specific analyses.

The second panel, located under the main visualization window, is a series of lists titled “Sub-compartment,” “Feature Type,” “Specific Feature,” and “Custom List.” By selecting an item in the “Sub-compartment” list, users can access the relevant sub-compartments of interest for quantifying associated hand-engineered features and feature visualizations. The “Feature Type” list contains the categories of features (e.g., size, morphology, texture, color) that are included in that sub-compartment.
In the “Specific Feature” list, users can make specific image feature selections. Clicking the “Add to Custom” button, located beneath “Custom List,” adds the selected specific feature to the “Custom List.” Adding features to this list allows users to select combinations of hand-engineered features that are not in the same sub-compartment or feature type. Located next to the four lists are four buttons labeled “View Sub-Compartment Features,” “View Feature Type,” “View Specific Feature,” and “View Custom List.” Clicking these buttons controls which features are currently shown in the visualization panel, as discussed next.

The third panel is a toolbox that allows users to generate figures, build classification models, and generate image annotations for downstream analysis. The major tools are separated into tabs labeled “Feature Distribution Plot,” “Statistical Measures,” “Classification Models,” “Relative Feature Intensity,” “Add Annotations,” and “Feature Definitions.” Details of these tabs are provided below.

Feature distribution plot

This tab provides plots of the distribution of the currently selected hand-engineered features across a dataset. Labels on the plot show users the location of the image in the main display relative to the rest of the dataset. When a single feature is selected using the “View Specific Feature” button in the second panel, the data distribution for that feature is presented as a violin plot in which the points correspond to the annotated structures in the dataset. When more than one feature is selected using the “View Feature Type,” “View Sub-Compartment Features,” or “View Custom List” buttons, scatterplots show either the differences between two features or between the first and second principal components of the dimensionally reduced feature set. This tab also includes a second set of tabs, located below the feature distribution axes, for selecting the data that are included in the plot.
These tabs include the following five options: switching how each point is labeled, selecting regions in the feature space for making qualitative comparisons between sub-clusters displayed in the Feature Distribution Plot, labeling the data scatter according to slide metadata, adding labels from an external file or classification model, and removing outliers (with a table displaying the names of removed images). Using these tools, trends in the currently selected features can be more easily identified and evaluated.

Statistical measures

The second tab provides statistical comparisons, including a summary of descriptive statistics for the current feature (minimum, maximum, mean, median, etc.) identified by the current labels in the feature distribution plot. When a single feature is viewed, a statistical test provides users with a quantitative measure of statistical significance. The test implemented depends on the number of classes in the current labels: for two classes, a Student’s t-test is performed, and for three or more classes, a one-way ANOVA is performed22,23. For multiple features, the silhouette index for each class is displayed to quantify data clustering relative to other classes24. Principal component analysis is also provided in the form of the percentage of variance explained by the first two principal components, together with the coefficients for each feature. To view the individual contributions of specific features to the principal components, users can select features for a biplot (a two-variable scatterplot) on the feature distribution plot25. Arrows indicate the direction and magnitude of the coefficients for each feature.

Classification models

Using the features displayed in the feature distribution plot, a user can implement several classification models, including decision trees, discriminant analysis, a naïve Bayes classifier, and a simple neural network26,27,28,29.
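These are standard model families. As a hedged illustration of one of them, and of the F1 score (the harmonic mean of precision and recall), the Python sketch below implements a from-scratch Gaussian naïve Bayes classifier. None of these helper names are HistoLens functions, and HistoLens does not expose this code; it is only a minimal stand-in for the kind of model the tab trains.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Fit a Gaussian naive Bayes model: per-class feature means,
    variances, and priors. X has shape (n_samples, n_features)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9,
                     len(Xc) / len(X))
    return params

def predict_gaussian_nb(params, X):
    """Assign each sample the class with the highest log-posterior."""
    classes = list(params)
    scores = np.empty((len(X), len(classes)))
    for j, c in enumerate(classes):
        mean, var, prior = params[c]
        log_lik = -0.5 * (np.log(2 * np.pi * var) + (X - mean) ** 2 / var)
        scores[:, j] = log_lik.sum(axis=1) + np.log(prior)
    return np.array(classes)[scores.argmax(axis=1)]

def f1_score_binary(y_true, y_pred, positive):
    """F1 = harmonic mean of precision and recall for one class."""
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)
```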
The performance of each of these models on a particular image set is displayed using a confusion matrix for categorical labels and a violin plot of error for continuous labels. A performance report is also generated that includes other performance metrics of the model, including accuracy, sensitivity, specificity, F1 score (the harmonic mean of precision and recall, with values ranging from 0 to 1), and the number of structures in each slide predicted to belong to each class30. Saving the model, by clicking the “Save Model” button, generates a new directory in the slide directory containing the model object and the performance report (‘.xlsx’ file). This report also includes the features used to generate the model and the predicted labels for each member of the test set. Additional information on the Classification Models workflow is provided in Supplemental Document 1.

Relative feature intensity

This tab provides a quantitative description of the intensity of each feature, measured here as the area of the hand-engineered feature visualization present within the rectangular ROI in the first panel. For individual features, this plot displays the distribution of intensities contained within the feature visualization for that image. For multiple features, the quantification instead shows the relative content of each feature in the enclosed region compared with the rest of the image. Additionally, the relative values displayed in the histogram are recorded, along with feature name and feature rank, in table format. Users can use this tab to identify which regions are influential toward highly ranked features, as determined using a chi-square test (default) when new labels are imported through the “Add Labels” tab31.

Adding annotations

In this tab, users can define custom classes and annotate more specific regions in the current image.
These regions are saved locally as binary masks for each user-specified class, in which annotated pixels have a value of one and non-annotated background pixels have a value of zero. The resulting folder structure separates annotated images from different slides, with subfolders corresponding to the name of the annotated class in each mask. These binary masks can be used in downstream deep-learning pipelines for segmenting different types of lesions or specific areas within annotated compartments. This tab also allows users to transfer the current image annotation to another annotation layer in the original annotation file associated with the WSI. This feature can be used to assign structures to a separate group within the study or to create a new structure in the WSIs.

Feature definitions

The last tab contains written definitions for the feature or features under observation, as well as a description of how each feature’s visualization was generated (see Supplemental Table 1). A key for interpreting the columns of Supplemental Table 1a is provided in Supplemental Table 1b.

Experimental design

We evaluated the ability of HistoLens to recognize biologically significant characteristics of histological structures in two distinct experiments: one using a human dataset consisting of DN, MCD, and AN cases, and the other using a murine dataset consisting of HIVAN and wild-type (WT) control mice.

Tissue sectioning, staining, and imaging

Tissues were sectioned at 2–3 µm thickness for staining and imaging. The tissues were stained primarily with PAS and a hematoxylin counterstain. The slides were scanned using a brightfield microscopy WSI scanner, the NanoZoomer 2.0-HT (Hamamatsu, Shizuoka, Japan).
The pixel resolutions of the images used were 0.26, 0.25, and 0.51 µm/pixel.

Human DN, MCD, and AN

In the first experiment, we examined whether and how HistoLens distinguishes features between nodular sclerotic glomerular pathologies (DN and AN) and histologically normal glomeruli (MCD). This comparison was selected because the visual appearances of glomeruli in DN and AN are readily distinguishable from MCD using PAS staining visualized by light microscopy (Fig. 3)32,33. Additionally, there already exists a well-described set of distinct diagnostic features that identifies the presence of DN and AN33. The primary feature of significance is the extent and quality of the glomerular mesangium. In glomeruli manifesting nodular diabetic glomerulosclerosis, nodular mesangial matrix expansion is apparent, in extreme cases forming Kimmelstiel–Wilson nodules33. In contrast, amyloid glomerulopathy, also a nodular pattern, is characterized by deposition of non-matrix material in the mesangium and along capillary walls34. By light microscopy, amyloid is generally PAS-weak, in contrast to the strong PAS positivity of nodular diabetic glomerulopathy.

Figure 3. Nine example glomeruli each from cases of (A) MCD, (B) DN, and (C) AN. (D) Arrows highlight regions of nodular glomerulosclerosis in DN (PAS-strong) and amyloid nephropathy (PAS-weak) that are not present in MCD. Scale bar indicates 50 µm.

In HistoLens, the distinguishing features among DN, AN, and MCD can be grouped into two categories: mesangial size features and mesangial color features. For each of these categories, several metrics can be quantified to identify more detailed and objective differences among the samples.
These features include measures of PAS+ area (the number of pixels contained within segmented PAS+ regions) and PAS+ region color statistics (mean and standard deviation of red, green, and blue [RGB] values).

Human lupus nephritis (LN) and IgA nephropathy

To demonstrate the ability of HistoLens to extend to other diseases and stain types, we further examined two additional datasets; abridged analyses are included in Supplemental Documents 3 and 4. The first dataset contained PAS-stained slides from patients with lupus nephritis (LN) who exhibited different responses to treatment (complete response (CR) and no response (NR)). The second dataset consisted of silver-stained slides from patients with IgA nephropathy (IgAN), in which HistoLens was employed to quantify differences in silver deposition between patients.

Murine HIV-associated nephropathy (HIVAN) model

Next, we used HistoLens to characterize differences between diseased and WT groups in a mouse study set. The mouse cohort was an HIVAN model: Tg26 mice from the FVB/N strain contain a transgene with a gag-pol-deleted HIV-1 genome and manifest a sclerosing glomerulopathy (Fig. 4)35. We sought to test whether quantitative assessment and visualization in HistoLens could achieve a sufficiently granular analysis of the effects of experimental factors on tissue compartments. See Supplemental Table 2 for dataset composition details.

Figure 4. Example glomeruli from the HIV-associated nephropathy (HIVAN) dataset. (A) Nine example glomeruli taken from wild-type mice. (B) Nine example glomeruli taken from HIVAN mice. (C) Example of global and focal glomerulosclerosis in HIVAN mice compared with a wild-type mouse glomerulus.
Scale bar indicates 50 µm.

Sources for other quantitative methods applied in this work, including references and MATLAB implementation documentation, can be found in Supplemental Table 3.

Approval for animal experiments

All animal studies were performed in accordance with protocols approved by the Laboratory of Animal Science Section (LASS) at NIH NIDDK, consistent with federal guidelines and regulations and with the recommendations of the American Veterinary Medical Association guidelines on euthanasia. This study is reported in accordance with the ARRIVE guidelines.

Approval for human experiments

Human data collection followed a protocol approved by the Institutional Review Boards at Johns Hopkins, the University of Florida, and the University of Michigan prior to commencement. All methods were performed in accordance with the relevant federal guidelines and regulations. Participants were required to be over 18 years of age and to have a diagnosis of chronic kidney disease requiring renal biopsy. Vulnerable populations, such as minors, pregnant women, neonates, prisoners, and cognitively impaired patients, were not included. All patients provided written informed consent.
