Automated gall bladder cancer detection using artificial gorilla troops optimizer with transfer learning on ultrasound images

This manuscript presents an automated GBCD-AGTOTL technique for US images. The method examines US images for the presence of gall bladder cancer using a DL model. Figure 1 depicts the overall procedure of the GBCD-AGTOTL technique.

Fig. 1 Overall process of the GBCD-AGTOTL technique.

Data used

The performance of the GBCD-AGTOTL technique is evaluated on the GBCU dataset [2], the primary public dataset for GBC recognition from US images. GBCU comprises 1255 annotated abdominal US images (432 normal, 558 benign, and 265 malignant) from 218 patients. Table 1 summarizes the details of the dataset, and Fig. 2 shows sample images.

Table 1 Details of the dataset.

Fig. 2 Sample Grad-CAM visuals of GBCNet: (a) Normal, (b) Benign, and (c) Malignant.

Preprocessing

In the initial stage, the GBCD-AGTOTL technique preprocesses the US images using the median filtering (MF) approach. In medical image processing, MF plays a vital part in improving the quality and reliability of diagnostic images [23]. Selecting MF for preprocessing offers several crucial merits. MF is highly efficient at eliminating salt-and-pepper noise from images while preserving edges and fine details, which is significant for maintaining the integrity of features in the data. Unlike mean filters, MF does not blur the image, making it particularly useful for preserving sharpness and clarity. Furthermore, MF is robust to outliers and can handle varying noise levels efficiently. Its simplicity and computational efficiency make it a practical option for real-time applications. Overall, MF offers a balance between noise reduction and detail preservation, making it an ideal preprocessing tool for many image and signal processing tasks. Figure 3 depicts the structure of the MF technique.

Fig. 3 Structure of the MF technique.

With its capability to efficiently suppress salt-and-pepper noise, which is common in medical imaging owing to numerous acquisition factors, MF preserves vital structural details and boundaries within the images. This non-linear filtering model is particularly beneficial in medical applications where exact visualization of anatomical features is essential for precise analysis of US images. By removing noise without sacrificing vital image data, MF contributes to enhanced image clarity and enables more accurate analysis by healthcare experts, ultimately improving the diagnostic efficiency of medical imaging methods.
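As an illustration, the following is a minimal sketch of this preprocessing step, assuming OpenCV; the kernel size and file handling are illustrative rather than the paper's settings.

```python
# Minimal sketch of the median-filtering preprocessing step
# (assumed implementation; kernel size and I/O are illustrative).
import cv2

def preprocess_us_image(path: str, kernel_size: int = 5):
    """Load a grayscale ultrasound image and suppress salt-and-pepper
    noise with a median filter while preserving edges."""
    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if image is None:
        raise FileNotFoundError(path)
    # Each pixel is replaced by the median of its k x k neighbourhood,
    # which removes impulse noise without blurring boundaries.
    return cv2.medianBlur(image, kernel_size)
```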
Feature extraction

For feature extraction, the GBCD-AGTOTL technique applies the Inception module, which learns the complex and intrinsic patterns in the preprocessed image. Selecting the Inception module for feature extraction offers several key merits. The architecture of the Inception module incorporates convolutional filters of different sizes and pooling operations in parallel, enabling it to capture a diverse range of features from the input data. This multi-scale methodology improves the capability of the technique to recognize patterns and details at various levels of abstraction. Furthermore, the Inception module utilizes dimensionality-reduction methods, namely 1 × 1 convolutions, to manage computational complexity and memory usage effectively. Its deep network design supports efficient feature learning while maintaining manageable resource needs. These characteristics make the Inception module particularly efficient at handling complex datasets and attaining high performance in feature extraction tasks.

A CNN extracts features layer by layer via the sliding of convolution kernels over the feature maps and is widely applied in different fields [24]. Expanding the depth and width of the network is an effective way to enhance its performance, but it makes training more complex and encourages overfitting. The Inception module is used to resolve these challenges: it applies convolution kernels of varying sizes for feature extraction, passes the results through 1 × 1 convolutions to reduce channel dimensionality, and finally merges the channels to form the extracted features. The parameter count remains manageable while the width of the network is extended. Batch normalization is generally applied between the convolutional layer and the activation function; it standardizes the information across the channel dimension, which efficiently reduces overfitting, improves training speed, and alleviates vanishing gradients:
$${\mu}_{B}=\frac{1}{m}\sum_{i=1}^{m}{x}_{i}$$
(1)
$${\sigma}_{B}^{2}=\frac{1}{m}\sum_{i=1}^{m}({x}_{i}-{\mu}_{B})^{2}$$
(2)
$$\widehat{{x}_{i}}=\frac{{x}_{i}-{\mu}_{B}}{\sqrt{{\sigma}_{B}^{2}+\varepsilon}}$$
(3)
$${y}_{i}=\gamma \widehat{{x}_{i}}+\beta$$
(4)
Here, \({\mu}_{B}\) denotes the mean of the training batch, \({\sigma}_{B}^{2}\) its variance, and \(m\) the amount of data in the batch; \({x}_{i}\) and \(\widehat{{x}_{i}}\) are the ith datum on the feature map before and after standardization, respectively; \({y}_{i}\) is the output after scaling and shifting; \(\gamma\) and \(\beta\) are the learned scaling and translation factors, respectively; and \(\varepsilon\) is a small constant that prevents division by zero when the variance is 0.
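To make the multi-branch design concrete, the following is a minimal sketch of an Inception-style block with batch normalization, assuming PyTorch; the channel counts are illustrative, not the paper's configuration.

```python
# Minimal sketch of an Inception-style block with batch normalization
# (assumed PyTorch implementation; channel counts are illustrative).
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, in_ch: int):
        super().__init__()
        def branch(out_ch, k):
            # 1 x 1 convolution first to reduce channel dimensionality,
            # then a k x k convolution; BN sits before the activation.
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1), nn.BatchNorm2d(out_ch), nn.ReLU(),
                nn.Conv2d(out_ch, out_ch, k, padding=k // 2),
                nn.BatchNorm2d(out_ch), nn.ReLU(),
            )
        self.b1 = branch(16, 1)   # fine detail
        self.b3 = branch(16, 3)   # mid-scale patterns
        self.b5 = branch(16, 5)   # coarse structures
        self.pool = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, 16, 1), nn.BatchNorm2d(16), nn.ReLU(),
        )

    def forward(self, x):
        # Parallel multi-scale branches are merged along the channel axis.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.pool(x)], dim=1)
```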

AGTO-based hyperparameter tuning

In this research, the AGTO model performs the hyperparameter-tuning procedure, optimally selecting the hyperparameter values of the Inception method. AGTO is a meta-heuristic technique based on gorilla social behaviour [25]. The AGTO method presents several merits over other optimization approaches. Its model replicates the behaviour of gorilla troops, employing both exploration and exploitation efficiently to avoid local optima and improve global search ability. The flexible structure of AGTO allows it to adapt to diverse problem landscapes, making it applicable across several domains. Furthermore, its simple design ensures ease of implementation and computational efficiency. The model has demonstrated strong performance on complex optimization problems owing to its robust convergence properties and its capacity to balance diverse objectives. These advantages make AGTO particularly appropriate for challenging optimization tasks where conventional models may struggle. Figure 4 demonstrates the architecture of the AGTO approach.

Fig. 4 Structure of the AGTO technique.

The gorilla is the largest and strongest of the great apes. An adult male gorilla leads each group and has a strong sense of territory; such leading males are also called silverback gorillas. A gorilla group typically contains an adult male, adult females, and their offspring. As leaders, adult male gorillas undertake the duties of defending territory, making decisions, and directing the other gorillas when foraging for food. Three different operators are utilized in the exploration stage: migration to an unknown location, which enhances GTO exploration; movement towards other gorillas, which improves the balance between exploration and exploitation; and migration towards a known place, which considerably enhances GTO's ability to search different regions of the optimization space. In the exploitation phase, two operators are employed that markedly refine the search. GTO uses a distinct mechanism for switching between the exploration and exploitation stages, and generally follows these rules when searching for a solution:

The GTO algorithm maintains three kinds of solutions: X, the gorilla position vector; GX, the candidate gorilla position vector produced in every stage; and the Silverback, the best solution found in each iteration.

Only one Silverback exists in the entire population; it is the search agent selected to lead the optimization process.

The three solution types (Silverback, X, and GX) model the gorillas' everyday life in nature.

A gorilla can increase its influence by discovering better food sources or by positioning itself within a strong group. In GTO, candidate solutions, denoted GX, are generated in every iteration. If a newly generated candidate GX is better, it replaces the present solution X; otherwise, it is retained only in GX.

Gorillas tend to live collectively rather than alone. Thus, they search for food as a group and live under a silverback leader, who makes the decisions. In the formulation, the worst solution in the population is regarded as the weakest member of the gorilla group; the gorillas try to move away from the worst solution and towards the best (Silverback) solution, which improves the positions of the whole troop.

The AGTO algorithm is applied in two distinct stages, exploration and exploitation. The algorithm simulates the gorillas' social behaviours and employs five operators, centred on gorilla actions, to model the exploration and exploitation mechanisms.

Exploration stage

Three different operations are implemented during the exploration stage: first, movement to an unknown place; then, movement towards a known place; and lastly, movement towards other gorillas. To choose among these operations, a factor \(x\) is selected in advance to govern the migration of a gorilla to an unknown place: migration to an unknown location occurs when \(rand\) is less than \(x\). Otherwise, if \(rand\) is greater than or equal to 0.5, the gorilla moves towards other gorillas in the group; if \(rand\) is less than 0.5, it moves towards the known place. Equations (5), (6), and (11) express the three operations executed during the exploration stage. The movement to an unknown place is given by
$$F\left(k+1\right)=\left(Uppe{r}_{b}-Lowe{r}_{b}\right)\times {s}_{1}+Lowe{r}_{b}$$
(5)
The above equation expresses the movement to an unknown place, where \({s}_{1}\) denotes a value selected randomly between 0 and 1. The mechanism of movement towards the known place is given in Eq. (6):
$$F\left(k+1\right)=\left({s}_{2}-D\right)\times {A}_{s}\left(k\right)+Y\times Z$$
(6)
where \(Y\), \(Z\), and \(D\) in the above equation are calculated as presented in Eqs. (7)–(10):
$$D=G\times \left(1-\frac{cur{r}_{iter}}{to{t}_{iter}}\right)$$
(7)
$$G=\cos\left(2\times {s}_{4}\right)+1$$
(8)
Meanwhile, \(y\) signifies a value selected randomly between −1 and 1, so that
$$Y=D\times y$$
(9)
$$Z=P\times A\left(k\right)$$
(10)
Here, \(P\) denotes a random value in the interval \([-D,D]\). The movement towards other gorillas in the group is expressed in Eq. (11):
$$F\left(k+1\right)=A\left(j\right)-Y\times \left(Y\times \left(A\left(k\right)-F{A}_{s}\left(k\right)\right)\right)+{s}_{3}\times \left(A\left(k\right)-F{A}_{s}\left(k\right)\right)$$
(11)
Finally, the cost of every candidate produced during exploration is evaluated, and the best solution found is designated as the Silverback.
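To make the update rules concrete, here is a minimal numpy sketch of the exploration stage in Eqs. (5)–(11). It is a sketch under assumptions: the random draws \(s_{1}\)–\(s_{3}\) are treated as uniform in [0, 1], the selection factor x is a small assumed constant, and random troop members stand in for the randomly selected positions \(A_{s}(k)\) and \(FA_{s}(k)\).

```python
# Minimal numpy sketch of the GTO/AGTO exploration update, Eqs. (5)-(11).
# Names follow the text; x (selection factor) and bounds are assumptions.
import numpy as np

def explore(A, k, tot_iter, lower_b, upper_b, x=0.03, rng=None):
    """Produce candidate positions F for the troop A (n x dim) at iteration k."""
    rng = rng or np.random.default_rng()
    n, dim = A.shape
    G = np.cos(2 * rng.random()) + 1                  # Eq. (8)
    D = G * (1 - k / tot_iter)                        # Eq. (7)
    Y = D * rng.uniform(-1, 1)                        # Eq. (9)
    F = np.empty_like(A)
    for i in range(n):
        rand = rng.random()
        if rand < x:                                  # unknown place, Eq. (5)
            F[i] = (upper_b - lower_b) * rng.random(dim) + lower_b
        elif rand >= 0.5:                             # towards others, Eq. (11)
            other = A[rng.integers(n)]
            s3 = rng.random()
            F[i] = A[i] - Y * (Y * (A[i] - other)) + s3 * (A[i] - other)
        else:                                         # known place, Eq. (6)
            Z = rng.uniform(-D, D, dim) * A[i]        # Eq. (10), P in [-D, D]
            s2 = rng.random()
            F[i] = (s2 - D) * A[rng.integers(n)] + Y * Z
    return F
```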
Exploitation stage

This stage includes two main processes. In the first, all adult members of the gorilla group follow the commands of the Silverback selected in the exploration stage. In the second, a competition takes place among the adult gorillas, once mature, over the adult females. The first process, following the Silverback, is expressed in Eq. (12):
$$F\left(k+1\right)=Y\times B\times \left(A\left(k\right)-{A}_{sb}\right)+A\left(k\right)$$
(12)
where \(A\left(k\right)\) signifies the gorilla's position and \({A}_{sb}\) denotes the position of the gorilla chosen as the Silverback, and
$$B={\left({\left|\frac{1}{H}\sum_{j=1}^{H}F{A}_{j}\left(k\right)\right|}^{f}\right)}^{\frac{1}{f}}$$
(13)
Here, \(H\) denotes the number of candidate solutions, and the exponent \(f\) is calculated by Eq. (14):
$$f={2}^{Y}$$
(14)
The second process, the competition for the adult females, is denoted in Eq. (15):
$$F\left(k\right)={A}_{sb}-\left({A}_{sb}\times T-A\left(k\right)\times T\right)\times M$$
(15)
$$T=2\times {s}_{5}-1$$
(16)
$$M=\delta \times N$$
(17)
where \({s}_{5}\) is a value selected randomly between 0 and 1, \(\delta\) is a preset coefficient, and \(N\) is drawn from a standard normal distribution. The cost of all solutions obtained from the exploitation-stage processes is then calculated, and the optimal solution overall is designated as the Silverback.
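Continuing the sketch above, the exploitation update of Eqs. (12)–(17) might look as follows; the switch between following the silverback and competing for the females, the threshold w, and the coefficient delta are assumptions in the spirit of the underlying GTO formulation rather than the paper's exact settings.

```python
# Minimal numpy sketch of the exploitation update, Eqs. (12)-(17).
# delta and the stage-switch threshold w are assumptions (GTO-style).
import numpy as np

def exploit(A, A_sb, Y, D, delta=3.0, w=0.8, rng=None):
    """Candidate positions F given the silverback position A_sb."""
    rng = rng or np.random.default_rng()
    n, dim = A.shape
    F = np.empty_like(A)
    if abs(D) >= w:                                      # follow the silverback
        f = 2.0 ** Y                                     # Eq. (14)
        B = (np.abs(A.mean(axis=0)) ** f) ** (1.0 / f)   # Eq. (13)
        for i in range(n):
            F[i] = Y * B * (A[i] - A_sb) + A[i]          # Eq. (12)
    else:                                                # compete for females
        for i in range(n):
            T = 2 * rng.random() - 1                     # Eq. (16)
            M = delta * rng.standard_normal(dim)         # Eq. (17)
            F[i] = A_sb - (A_sb * T - A[i] * T) * M      # Eq. (15)
    return F
```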
The fitness function (FF) is used to evaluate the performance of the AGTO model. The hyperparameter-selection procedure includes a solution-encoding method to estimate the quality of each candidate solution. In this work, the AGTO model considers accuracy as the primary measure for constructing the FF, formulated as follows:
$$Fitness=\max\left(P\right)$$
(18)
$$P=\frac{TP}{TP+FP}$$
(19)
In the above expression, TP signifies the true-positive count and FP the false-positive count.
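As an illustration of how Eqs. (18)–(19) drive the search, the sketch below scores one candidate; the hyperparameter encoding and the train_and_validate helper are hypothetical stand-ins for the paper's training pipeline.

```python
# Minimal sketch of the AGTO fitness evaluation, Eqs. (18)-(19).
# `train_and_validate` is a hypothetical stand-in for training the
# Inception model with the decoded hyperparameters and returning
# validation TP/FP counts.
import numpy as np

def fitness(candidate: np.ndarray) -> float:
    # Decode the position vector into hyperparameters (assumed encoding).
    lr = 10.0 ** candidate[0]        # e.g. candidate[0] in [-5, -1]
    batch_size = int(candidate[1])   # e.g. candidate[1] in [8, 64]
    tp, fp = train_and_validate(lr=lr, batch_size=batch_size)
    return tp / (tp + fp)            # Eq. (19); AGTO maximizes this, Eq. (18)
```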
Classification using the BiGRU model

Finally, the BiGRU model is used to classify gall bladder cancer. The GRU is a kind of NN that differs from other NNs owing to its internal "gate" structure [26]. The gates decide what information to discard and what to retain based on relevance, which effectively controls the internal information flow and the transmission of data. Consequently, the GRU partly addresses the problem of long-term dependencies in NN models. The GRU has three significant elements, namely the hidden state, the reset gate, and the update gate, which work together to capture long-term dependencies and extract temporal features. The distinct functions of these gates are discussed below. Figure 5 illustrates the framework of the BiGRU technique.

Fig. 5 Structure of the BiGRU model.

(1)

Reset and update gates

The inputs for the reset and update gates are obtained from an FC layer that operates on the present input \({x}_{t}\) and the hidden layer (HL) of the previous moment \({h}_{t-1}\). A sigmoid activation converts the reset-gate input into a value within the [0, 1] range: a value of 0 means the corresponding input is rejected entirely, while a value of 1 means it is retained. This value defines the relevance of the input. The product of the previous hidden state and the reset output is computed component-wise. The update gate controls the degree to which the previous information state is integrated into the current state. Together, the update and reset gates determine the candidate HL of the current neuron. The reset gate is calculated using Eq. (20):
$${r}_{t}=sigmoid\left(\left[{w}_{hr}|{w}_{xr}\right]\otimes {\left[{h}_{t-1}|{x}_{t}\right]}^{T}\right)$$
(20)
where \({w}_{hr}\) and \({w}_{xr}\) denote the weight matrices for the last time step's HL and for the input series at the current time step, respectively; \(\otimes\) refers to the matrix product; \(t\) denotes the present time step and \(t-1\) the last time step; \({x}_{t}\) is the input series of the current time step; and \({h}_{t-1}\) is the hidden layer of the last time step. The candidate hidden layer is evaluated from the current data by Eq. (21):
$$\tilde{{h}_{t}}=\text{tanh}\left({w}_{\tilde{h}}\otimes \left[\left({h}_{t-1}\circ {r}_{t}\right)|{x}_{t}\right]\right)$$
(21)
where the Hadamard product is denoted as \(\circ\), \(\tilde{{h}_{t}}\) refers to the candidate HL, and \({w}_{\tilde{h}}\) denotes the weight matrix of the candidate HL. Equation (22) computes the update-gate output, which determines how much of the data must be updated:
$${z}_{t}=sigmoid\left(\left[{w}_{hz}|{w}_{xz}\right]\otimes {\left[{h}_{t-1}|{x}_{t}\right]}^{T}\right)$$
(22)
where \({z}_{t}\) denotes the update-gate output at the present step. The information retained from the last time step is evaluated by Eq. (23) according to the update gate:
$${h}_{t-1}^{*}=\left(1-{z}_{t}\right)\circ {h}_{t-1}$$
(23)

(2)

Hidden layer

The HL is the output of the GRU, obtained from the candidate HL and the information of the last time step retained by the update gate, computed as follows:
$${h}_{t}=\left({z}_{t}\circ \tilde{{h}_{t}}\right)\oplus {h}_{t-1}^{*}$$
(24)
where \(\oplus\) indicates the matrix addition operator.
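The gate equations translate directly into code. Below is a minimal numpy sketch of one GRU step implementing Eqs. (20)–(24); the concatenated-weight products of Eqs. (20) and (22) are expanded into separate matrix products, and the weight shapes are assumptions.

```python
# Minimal numpy sketch of a single GRU step, Eqs. (20)-(24).
# Assumed weight shapes: w_h* are (h, h), w_x* are (h, d).
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, w_hr, w_xr, w_hz, w_xz, w_hc, w_xc):
    r_t = sigmoid(w_hr @ h_prev + w_xr @ x_t)             # reset gate, Eq. (20)
    z_t = sigmoid(w_hz @ h_prev + w_xz @ x_t)             # update gate, Eq. (22)
    h_cand = np.tanh(w_hc @ (r_t * h_prev) + w_xc @ x_t)  # candidate, Eq. (21)
    h_keep = (1.0 - z_t) * h_prev                         # retained part, Eq. (23)
    return z_t * h_cand + h_keep                          # new hidden state, Eq. (24)
```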

(3)

Bi-GRU

For a unidirectional GRU, the information obtained is affected only by the data up to the last time step, since the HL is transmitted from front to back. However, to extract a feature, attention should be paid both to the data at and before the present step and to the time states propagated from back to front. The hidden layer of the present moment in the BiGRU is therefore affected by the forward HL, the backward HL, and the input series of the present time step. Let \({x}_{t}\in {R}^{n\times d}\) denote the input of the Bi-GRU, where \(n\) indicates the number of samples in a mini-batch and \(d\) indicates the dimension of the input at each moment. Given the backward hidden state \(\overleftarrow{{h}_{t}}\in {R}^{n\times h}\) and the forward hidden state \(\overrightarrow{{h}_{t}}\in {R}^{n\times h}\), the two are computed as follows:
$$\overrightarrow{{h}_{t}}=sigmoid\left({x}_{t}{w}_{xh}^{\left(f\right)}+\overrightarrow{{h}_{t-1}}{w}_{hh}^{\left(f\right)}+{b}_{h}^{\left(f\right)}\right)$$
(25)
$$\overleftarrow{{h}_{t}}=sigmoid\left({x}_{t}{w}_{xh}^{\left(b\right)}+\overleftarrow{{h}_{t-1}}{w}_{hh}^{\left(b\right)}+{b}_{h}^{\left(b\right)}\right)$$
(26)
where \({w}_{xh}^{\left(f\right)}\in {R}^{d\times h}\), \({w}_{hh}^{\left(f\right)}\in {R}^{h\times h}\), \({w}_{xh}^{\left(b\right)}\in {R}^{d\times h}\), and \({w}_{hh}^{\left(b\right)}\in {R}^{h\times h}\) are weight matrices. The computed \(\overrightarrow{h}\) and \(\overleftarrow{h}\) are concatenated to obtain the HL \({h}_{t}\in {R}^{n\times 2h}\) at the present moment, and the output layer \({o}_{t}\in {R}^{n\times q}\) is computed by Eq. (27):
$${o}_{t}={h}_{t}{w}_{hq}+{b}_{q}$$
(27)
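Finally, a minimal numpy sketch of the bidirectional pass in Eqs. (25)–(27), using the simplified sigmoid recurrences exactly as written above; array shapes follow the text, and all weights are assumed to be pre-initialized.

```python
# Minimal numpy sketch of the BiGRU layer, Eqs. (25)-(27).
# Shapes follow the text: x is (T, n, d); hidden size h; output size q.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def bigru_layer(x, w_xh_f, w_hh_f, b_f, w_xh_b, w_hh_b, b_b, w_hq, b_q):
    T, n, d = x.shape
    h = w_hh_f.shape[0]
    h_fwd = np.zeros((T, n, h))
    h_bwd = np.zeros((T, n, h))
    prev = np.zeros((n, h))
    for t in range(T):                        # forward pass, Eq. (25)
        prev = sigmoid(x[t] @ w_xh_f + prev @ w_hh_f + b_f)
        h_fwd[t] = prev
    prev = np.zeros((n, h))
    for t in reversed(range(T)):              # backward pass, Eq. (26)
        prev = sigmoid(x[t] @ w_xh_b + prev @ w_hh_b + b_b)
        h_bwd[t] = prev
    h_cat = np.concatenate([h_fwd, h_bwd], axis=-1)   # (T, n, 2h)
    return h_cat @ w_hq + b_q                 # output layer, Eq. (27)
```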
Ethics approval

This article contains no studies with human participants performed by any of the authors.
