Dynamic mutation based glowworm swarm optimization with long short-term memory approaches for thyroid nodule classification

Objectives: To design an efficient approach for thyroid nodule classification with a higher true positive rate. Methodology and statistical analysis: The proposed system is a Dynamic Mutation based Glowworm Swarm Optimization with Long Short-Term Memory (DMGSO with LSTM) scheme for thyroid nodule classification. Input thyroid images are preprocessed using a Dynamically Weighted Median Filter (DWMF). The preprocessed images are segmented with a region based active contour scheme. Improved Local Binary Pattern (ILBP), Grey Level Co-occurrence Matrix (GLCM) and Histogram of Oriented Gradient (HOG) features are extracted from the segmented image. The optimal features are then selected using the Dynamic Mutation based Glowworm Swarm Optimization (DMGSO) algorithm. Finally, the Long Short-Term Memory (LSTM) scheme is used to classify the thyroid nodule. Findings: The experimental results show that the proposed system achieves better performance than the existing systems in terms of accuracy, precision, recall and f-measure.


Introduction
A thyroid nodule is a solid lump that can grow in the thyroid gland. It can be a single lump or a cluster of nodules. Research studies indicate that 60% of people are affected by thyroid nodules. Fine Needle Aspiration Cytology (FNAC) is popular and widely used for diagnosing thyroid nodules because of its higher sensitivity compared to other methods (1). FNAC is a fast and inexpensive method. It provides important information for differentiating benign from malignant nodules, which reduces unnecessary surgeries. Recently, medical imaging techniques including Ultrasound (US) imaging and Computerized Tomography (CT) have been used for diagnosing thyroid nodules with greater accuracy (2). The US imaging modality is non-invasive, low cost and does not use any ionizing radiation. Compared to CT, US imaging is more widely used because of its size and portability. However, US imaging is operator dependent: images are analyzed manually by a sonographer or physician. Manual analysis is subjective, and even experienced practitioners may provide different diagnoses. To address these issues, Computer Aided Diagnosis (CAD) systems have been proposed to discriminate benign from malignant nodules. Feature extraction plays a significant role in the classification task. Researchers have attempted to develop CAD systems using various feature extraction and classification techniques (3,4). Most researchers have used texture features and Support Vector Machine (SVM) for diagnosing thyroid nodules (5). Deep learning neural networks have been successfully applied in many fields such as pattern recognition, segmentation, object detection and classification. Studies have shown that deep learning neural networks provide outstanding performance compared to standard Artificial Neural Networks (ANNs) (6)(7)(8).
Previous work designed a Modified Ant Colony Optimization (MACO) with Modified Adaptive Network-Based Fuzzy Inference System (MANFIS) for thyroid ultrasound image classification (9), but it has an issue with training time. To eliminate the operator dependency and improve the diagnostic accuracy, the proposed system uses a deep learning based thyroid nodule classification that minimizes the error between the observed and predicted data. Initially, the thyroid images are segmented using a region based active contour scheme. Improved Local Binary Pattern (ILBP), Grey Level Co-occurrence Matrix (GLCM) and Histogram of Oriented Gradient (HOG) features are extracted from the segmented image. The Dynamic Mutation based Glowworm Swarm Optimization (DMGSO) algorithm is used for optimal feature selection. Finally, the Long Short-Term Memory (LSTM) scheme is used to classify the thyroid nodule.
The rest of this study is outlined as follows: Section 2 details a survey of the thyroid nodule classification methods. Section 3 explains the functioning of the developed model. Section 4 presents the numerical results. Section 5 presents empirical findings of this research work.

Proposed Methodology
The proposed system focuses on the diagnosis of thyroid nodules based on the Dynamic Mutation based Glowworm Swarm Optimization with Long Short-Term Memory (DMGSO-LSTM) scheme. Figure 1 shows the block diagram of the developed model for thyroid nodule classification.

Preprocessing using Dynamically Weighted Median Filter (DWMF)
In this work, DWMF is used for pre-processing. To obtain noise-free images, a W × W window is formed using a 2D Gaussian surface function. Let I be the input noisy image and $I_b$ the corresponding binary image. The W × W windows $W_n$ and $W_b$ are chosen around the identified noisy pixels in I and $I_b$ respectively. The window weight $W_{wt}$ is computed and moved over $I_b$, and pixels are discarded wherever $W_b$ has the value 1. Detected noisy pixels are assigned 0, and the weights are shifted if gaps are observed in $W_{wt}$ after the noisy pixels are eliminated. For instance, if the $W_{wt}$ of the window is 4, the pixel is labelled as a noisy pixel and removed. Therefore, if 2 is detected when shifting from a $W_{wt}$ of 3 to 5, the weights are reallocated to reduce duplications. The modified window is added. The highest $W_{wt}$ is incremented by 1 if the sum is even; there is no change if the sum is odd. The final window is obtained after checking for an odd sum of the repeated windows, and it is used to create the repetition array $A_R$. The noisy pixel is substituted by the median value of $A_R$.
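The repetition-array idea can be illustrated with a generic weighted median filter. This is only a minimal sketch under stated assumptions, not the authors' DWMF: the 3 × 3 integer kernel `GAUSSIAN_LIKE`, the function names and the rule of skipping flagged neighbours are illustrative choices, and the weight-shifting step described above is not modelled.

```python
def weighted_median(image, noisy_mask, weights, row, col):
    """Weighted median of the clean neighbours of pixel (row, col)."""
    half = len(weights) // 2
    values, repeats = [], []
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < len(image) and 0 <= c < len(image[0])
            if inside and not noisy_mask[r][c]:  # discard pixels flagged as noisy
                values.append(image[r][c])
                repeats.append(weights[dr + half][dc + half])
    if not values:  # no clean neighbour: leave the pixel unchanged
        return image[row][col]
    if sum(repeats) % 2 == 0:  # make the total odd by bumping the largest weight
        repeats[repeats.index(max(repeats))] += 1
    # Repetition array: every clean pixel repeated as often as its weight says.
    repetition_array = sorted(v for v, n in zip(values, repeats) for _ in range(n))
    return repetition_array[len(repetition_array) // 2]


def filter_image(image, noisy_mask, weights):
    """Replace each flagged pixel by the weighted median of its clean neighbours."""
    return [
        [weighted_median(image, noisy_mask, weights, r, c) if noisy_mask[r][c]
         else image[r][c]
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]


# Hypothetical integer approximation of a 2D Gaussian surface (assumption).
GAUSSIAN_LIKE = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
image = [[1, 2, 3], [4, 255, 6], [7, 8, 9]]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # 1 marks a detected noisy pixel
restored = filter_image(image, mask, GAUSSIAN_LIKE)
print(restored[1][1])
```

In this toy input the centre pixel carries the kernel weight 4 and is the flagged one, so it is dropped and replaced by the median of the repeated neighbours.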

Segmentation
Localized Region based Active Contour (LRAC) is adopted for segmentation. Research studies have shown that LRAC using the level set method is a good candidate for thyroid nodule classification. LRAC segments the images in two steps: (i) curve evolution and (ii) segmentation. Curve evolution uses the level set to detect the boundary in the image. Based on the detected boundary, the active contour scheme segments the image from its background. The advantages of LRAC are that it is robust to noise and detects the boundary automatically.

Feature extraction methods
Feature extraction is used to extract the most important attributes from the segmented image. It is very difficult to select useful information from medical images. Over the past years, several feature extraction methods have been proposed, each with its own characteristics. No single algorithm or method can extract all the important features for thyroid nodule classification. To address this, the proposed system extracts the ILBP, GLCM and HOG features from the segmented image.

Improved Local Binary Pattern (ILBP)
LBP is an image processing method used to extract texture features. The merits of LBP are easy implementation and fast operation. ILBP improves the classification performance and assigns every uniform pattern a separate label ranging from 0 to $P(P-1)+1$, where P is the number of neighbouring pixels. In ILBP, the oriented mean and standard deviation of the local absolute difference are considered to make the matching more robust against local spatial structure changes. To reduce the variations of the mean and standard deviation of the directional differences, a parameter w is added that minimizes the directional difference along different orientations.
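As a point of reference, the basic 8-neighbour LBP code and the uniformity test can be sketched as follows. This is plain LBP, not the ILBP variant described above; the neighbour ordering and the function names are illustrative assumptions.

```python
# Clockwise 8-neighbourhood starting at the top-left pixel (assumed ordering).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_bits(image, row, col):
    """Threshold the 8 neighbours of an interior pixel against the centre value."""
    center = image[row][col]
    return [1 if image[row + dr][col + dc] >= center else 0 for dr, dc in NEIGHBOURS]


def is_uniform(bits):
    """A pattern is uniform if it has at most two circular 0/1 transitions."""
    transitions = sum(bits[k] != bits[(k + 1) % len(bits)] for k in range(len(bits)))
    return transitions <= 2


def lbp_code(bits):
    """Pack the bit pattern into an integer label."""
    return sum(bit << k for k, bit in enumerate(bits))


patch = [[5, 5, 5], [1, 3, 1], [1, 1, 1]]
bits = lbp_bits(patch, 1, 1)
print(bits, lbp_code(bits), is_uniform(bits))

# Count the uniform patterns for P = 8 by enumerating all 256 bit patterns.
uniform_count = sum(is_uniform([(n >> k) & 1 for k in range(8)]) for n in range(256))
print(uniform_count)
```

For P = 8 the enumeration gives 58 uniform patterns, which matches labels running from 0 to P(P−1)+1 = 57.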

Grey Level Co-occurrence Matrix (GLCM)
GLCM is a commonly used method for extracting textural features from images. It represents the relation between a reference pixel (i) and a neighbor pixel (j) in various orientations. The texture features calculated from the GLCM are contrast, correlation, homogeneity and energy.
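The four texture features can be sketched in plain Python for a single orientation. The definitions below follow the common convention in which energy is the sum of squared matrix entries and homogeneity uses 1 + |i − j| in the denominator; the paper does not state which definitions were used, so these formulas, the function names and the horizontal offset are assumptions.

```python
import math


def glcm(image, levels, d_row=0, d_col=1):
    """Normalized co-occurrence matrix for one pixel offset (default: right neighbour)."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(len(image)):
        for c in range(len(image[0])):
            rr, cc = r + d_row, c + d_col
            if 0 <= rr < len(image) and 0 <= cc < len(image[0]):
                counts[image[r][c]][image[rr][cc]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Contrast, correlation, homogeneity and energy of a normalized GLCM."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p))]
    mean_i = sum(i * v for i, j, v in cells)
    mean_j = sum(j * v for i, j, v in cells)
    std_i = math.sqrt(sum((i - mean_i) ** 2 * v for i, j, v in cells))
    std_j = math.sqrt(sum((j - mean_j) ** 2 * v for i, j, v in cells))
    covariance = sum((i - mean_i) * (j - mean_j) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        # Correlation is undefined for a constant image; 0.0 is returned here.
        "correlation": covariance / (std_i * std_j) if std_i * std_j else 0.0,
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "energy": sum(v * v for i, j, v in cells),
    }


image = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
matrix = glcm(image, levels=4)
features = texture_features(matrix)
print(features)
```

Repeating the call with other offsets, for example `d_row=1, d_col=0`, covers the remaining orientations.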

HOG features
The HOG is a method for extracting representative features. It extracts features based on local object appearance and shape, which are described by the intensity distribution. In the HOG method, the input image is divided into many groups and a histogram of gradients is calculated for each (10). The obtained histograms are combined to obtain the image descriptor. A local histogram method is applied to enhance the representation of the image descriptor (11). The intensity values are then used to standardize all cells within the block. The steps to extract HOG features are presented in Table 1.
Table 1. HOG feature extraction algorithm
Step 1: Compute the gradient value in both horizontal and vertical directions using Equation (1) and Equation (2) respectively.
Step 2: Calculate HOG. Orientation binning is the process of creating cell histograms. Histogram channels are either unsigned or signed. The unsigned histogram spans from 0 to 180 degrees whereas the signed histogram spans from 0 to 360 degrees. Each pixel is assigned to a channel based on the computed value.
Step 3: Create descriptor blocks. The cell orientation histograms are grouped into larger, spatially connected blocks before they are normalized. The grouping makes the descriptor robust to illumination and contrast variations. Rectangular (R-HOG) and circular (C-HOG) blocks are the most widely used. An R-HOG block is typically a square grid described by the number of cells per block, the number of pixels per cell, and the number of histogram channels. Adjacent blocks overlap by half the size of a block.
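Step 4: Normalize the blocks. Block normalization with the L2-norm can be defined as:
$$f = \frac{v}{\sqrt{\lVert v \rVert_2^2 + e^2}} \qquad (9)$$
where v is the unnormalized descriptor vector of a block and e is a small constant whose value does not influence the result.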
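The four steps can be sketched for a single cell as follows. The centred-difference gradient, the 9 unsigned bins and the function names are assumptions made for illustration; the gradient equations referenced in Step 1 are not reproduced in the text, so this is not necessarily the exact formulation used by the authors.

```python
import math


def gradients(image, row, col):
    """Centred differences in the horizontal and vertical directions (assumed form)."""
    gx = image[row][col + 1] - image[row][col - 1]
    gy = image[row + 1][col] - image[row - 1][col]
    return gx, gy


def cell_histogram(image, bins=9):
    """Magnitude-weighted histogram of unsigned orientations (0 to 180 degrees)."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx, gy = gradients(image, r, c)
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # fold sign away
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(v, e=1e-6):
    """L2-norm block normalization: v / sqrt(||v||^2 + e^2)."""
    norm = math.sqrt(sum(x * x for x in v) + e * e)
    return [x / norm for x in v]


# A horizontal ramp: every interior pixel has gradient (2, 0), orientation 0 degrees.
ramp = [[0, 1, 2, 3] for _ in range(4)]
hist = cell_histogram(ramp)
block = l2_normalize(hist)  # a block made of this single cell
print(hist, block)
```

A full descriptor would concatenate the normalized histograms of all overlapping blocks.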

Dynamic Mutation Based Glowworm Swarm Optimization (DMGSO)
The Glowworm Swarm Optimization (GSO) is a metaheuristic algorithm (12). Conventional GSO has limitations such as slow convergence and a time-consuming global search. In this work, DMGSO is used for optimal feature selection. The proposed DMGSO algorithm uses a mutation strategy to overcome the drawbacks of conventional GSO.
In GSO, a swarm of glowworms is initialized randomly. Each agent carries a quantity of luciferin and is attracted to other agents based on their luciferin intensity. A higher luciferin intensity represents a better solution at the current location. In each epoch, the glowworm positions change based on the brightness. The detailed steps of DMGSO are given below:
1. Glowworm initialization: Glowworms are initialized randomly and the epoch is set to 1.
2. Luciferin update phase: The fitness of each glowworm is calculated; if the current fitness is better than the previous value, the position and luciferin are updated. The luciferin update rule (objective function based on the features) is given by Equation (10):
$$l_i(t) = (1 - \rho)\, l_i(t-1) + \gamma J_i(t) \qquad (10)$$
where $l_i(t)$ is the luciferin of glowworm i at time t, $\rho$ is the luciferin decay constant ($0 < \rho < 1$), $\gamma$ represents the luciferin enhancement constant, and $J_i(t)$ is the function value.
3. Movement phase: Each glowworm uses a probabilistic mechanism to find a neighbor with a higher luciferin value and moves towards it. For each glowworm i, the probability of moving towards a neighbor j can be stated as
$$p_{ij}(t) = \frac{l_j(t) - l_i(t)}{\sum_{k \in N_i(t)} \left( l_k(t) - l_i(t) \right)}$$
where $j \in N_i(t)$, $N_i(t) = \{ j : d_{i,j}(t) < r_d^i(t);\ l_i(t) < l_j(t) \}$ is the set of neighbors of glowworm i, $r_d^i(t)$ is the variable local-decision domain, and $d_{i,j}(t)$ represents the Euclidean distance between glowworms i and j at time t. Let glowworm i select a glowworm j. Its position is then updated as
$$x_i(t+1) = x_i(t) + s \left( \frac{x_j(t) - x_i(t)}{\lVert x_j(t) - x_i(t) \rVert} \right)$$
where $x_i(t)$ is the location of glowworm i at time t, s is the step size, and $\lVert \cdot \rVert$ is the Euclidean norm operator.
4. Neighbourhood range update phase: In the GSO algorithm, the local-decision domain value is updated using Equation (14), where $\beta$ is a constant and $n_t$ is a parameter that controls the number of neighbors.
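The four phases can be sketched as small functions following the standard GSO formulation. This is a hedged illustration rather than the authors' implementation: the sensor range `r_s` that caps the decision domain comes from the standard algorithm and is not defined in the text, and all function names and the indexing of the luciferin update are assumptions.

```python
import math
import random


def update_luciferin(luciferin, fitness, rho, gamma):
    """Decay the previous luciferin and add a fitness-proportional term."""
    return (1 - rho) * luciferin + gamma * fitness


def neighbours(i, positions, luciferin, radius):
    """Brighter glowworms inside the local-decision domain of glowworm i."""
    return [j for j in range(len(positions))
            if j != i
            and math.dist(positions[i], positions[j]) < radius
            and luciferin[j] > luciferin[i]]


def move_probabilities(i, neighbour_ids, luciferin):
    """Probability of moving towards each neighbour, proportional to the luciferin gap."""
    total = sum(luciferin[k] - luciferin[i] for k in neighbour_ids)
    return [(luciferin[j] - luciferin[i]) / total for j in neighbour_ids]


def select_neighbour(neighbour_ids, probabilities, rng):
    """Roulette-wheel choice of the neighbour to follow."""
    return rng.choices(neighbour_ids, weights=probabilities)[0]


def move_towards(x_i, x_j, step):
    """Move a fixed step along the unit vector from x_i to x_j."""
    distance = math.dist(x_i, x_j)
    return [a + step * (b - a) / distance for a, b in zip(x_i, x_j)]


def update_range(radius, n_neighbours, beta, n_t, r_s):
    """Standard GSO range rule, capped by the sensor range r_s (assumed symbol)."""
    return min(r_s, max(0.0, radius + beta * (n_t - n_neighbours)))


positions = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
luciferin = [1.0, 2.0, 3.0]
ids = neighbours(0, positions, luciferin, radius=2.0)
probs = move_probabilities(0, ids, luciferin)
target = select_neighbour(ids, probs, random.Random(0))
positions[0] = move_towards(positions[0], positions[target], step=0.5)
print(ids, probs, positions[0])
```

For feature selection, each position would encode a candidate feature subset and the fitness would score that subset; this encoding is not specified in the text and is left out here.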

Dynamic mutation strategy
The dynamic mutation strategy is applied to $G_{best}$, where F is the scale factor and $X_a$ and $X_b$ are two random particles in the swarm with unequal fitness values. The mutation strategy is adopted to improve classification performance. If the fitness value of the mutated feature is better than that of the current feature, the mutated feature is selected as the best one and the positions are updated.
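The greedy acceptance described above can be sketched as follows. The mutation equation itself is not reproduced in the text, so the differential form `g_best + F * (x_a - x_b)` used here is an assumption borrowed from differential evolution, and the function name, the maximized toy fitness and the fixed scale factor are illustrative only.

```python
import random


def mutate_best(g_best, swarm, fitness, scale, rng):
    """Mutate the best position and keep the mutant only if its fitness improves."""
    scores = [fitness(x) for x in swarm]
    # Pairs of distinct particles with unequal fitness values.
    pairs = [(a, b) for a in range(len(swarm)) for b in range(len(swarm))
             if a != b and scores[a] != scores[b]]
    if not pairs:
        return g_best
    a, b = rng.choice(pairs)
    # Assumed differential mutation: g_best + F * (x_a - x_b).
    mutant = [g + scale * (xa - xb) for g, xa, xb in zip(g_best, swarm[a], swarm[b])]
    return mutant if fitness(mutant) > fitness(g_best) else g_best


def toy_fitness(x):
    """Toy objective to maximize (hypothetical): peak at the origin."""
    return -sum(v * v for v in x)


rng = random.Random(1)
swarm = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(6)]
g_best = max(swarm, key=toy_fitness)
new_best = mutate_best(g_best, swarm, toy_fitness, scale=0.5, rng=rng)
print(toy_fitness(g_best), toy_fitness(new_best))
```

Because the mutant is only accepted when it scores higher, the best fitness never decreases after this step.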
Algorithm 1: Dynamic mutation based GSO

Classification using enhanced LSTM
LSTM is a special form of Recurrent Neural Network (RNN). Although the standard RNN is a good candidate for complex problems, it has limitations such as the vanishing gradient problem (13). LSTM was introduced to overcome these drawbacks. An LSTM cell consists of four major parts, namely the input unit, forget gate, output gate and activation part. Figure 2 depicts the general structure of an LSTM cell. The input unit receives the signal from the external world. The forget gate is responsible for eliminating unwanted information.
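The input gate of the LSTM is defined as
$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$$
The forget gate is defined as
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$$
The cell state is defined as
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \qquad (18)$$
The output gate is defined as
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \qquad (19)$$
Finally, the hidden state is computed as
$$h_t = o_t \odot \tanh(c_t) \qquad (20)$$
where tanh is the hyperbolic tangent activation function, $x_t$ is the input at time t, $h_{t-1}$ is the previous hidden state, W and b are the weights and biases respectively, $\sigma$ is the logistic sigmoid function, $\odot$ denotes element-wise multiplication, and i, f, o and c are respectively the input gate, forget gate, output gate and cell state. $W_{ci}$, $W_{cf}$ and $W_{co}$ denote the weight matrices for the peephole connections. The input gate i, forget gate f and output gate o are responsible for information processing. Equation (18) calculates the cell state. The forget gate decides whether the previous information is passed to the next state or not. The output gate computes the outcome of the LSTM using Equation (19). The hidden state is calculated with Equation (20).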
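A single time step of a peephole LSTM cell can be sketched with scalar weights to show how the gates interact. This follows the standard peephole formulation and is only an illustration: the parameter names in the dictionary (`W_xi`, `W_hi` and so on) are assumed labels, and the weighted-average bias update proposed in this work is not modelled.

```python
import math


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One forward step of a scalar peephole LSTM cell; returns (h, c)."""
    # Input and forget gates peek at the previous cell state.
    i = sigmoid(p["W_xi"] * x + p["W_hi"] * h_prev + p["W_ci"] * c_prev + p["b_i"])
    f = sigmoid(p["W_xf"] * x + p["W_hf"] * h_prev + p["W_cf"] * c_prev + p["b_f"])
    # New cell state: keep part of the old state, add part of the candidate.
    c = f * c_prev + i * math.tanh(p["W_xc"] * x + p["W_hc"] * h_prev + p["b_c"])
    # The output gate peeks at the updated cell state.
    o = sigmoid(p["W_xo"] * x + p["W_ho"] * h_prev + p["W_co"] * c + p["b_o"])
    h = o * math.tanh(c)
    return h, c


NAMES = ["W_xi", "W_hi", "W_ci", "b_i", "W_xf", "W_hf", "W_cf", "b_f",
         "W_xc", "W_hc", "b_c", "W_xo", "W_ho", "W_co", "b_o"]
zero_params = {name: 0.0 for name in NAMES}  # hypothetical untrained cell

# With all-zero parameters every gate equals 0.5, so the cell state is halved.
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
print(h, c)
```

Running the step over a sequence means feeding the returned `h` and `c` back in as `h_prev` and `c_prev` for the next input.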
In this proposed work, bias values are updated using weighted average of the features.
Weighted average Bias =

Empirical study
The developed system is implemented on the MATLAB platform. Several experiments are conducted to assess the efficacy of the proposed system. The thyroid images are collected from http://cimalab.intec.co/applications/thyroid/ . This system has been mainly used for thyroid nodules that are ≥1 cm. The performance of the proposed DMGSO-LSTM and the existing Histogram, MLP, ILBP-ASO and MACO-MANFIS methods is evaluated using four commonly used metrics: accuracy, precision, recall and f-measure. Figure 3 shows the main menu, and the collection of database images is given in Figure 4. The input images are preprocessed with the Dynamically Weighted Median Filter (DWMF); the pre-processed image is shown in Figure 5. The pre-processed images are segmented using the localized region based active contour scheme, and the segmentation results are shown in Figure 6. Figure 7 presents the feature extraction results.
Based on the extracted features, the classification is performed using the Long Short-Term Memory (LSTM) scheme. The output corresponding to the input is shown in Figure 8.

Accuracy
Accuracy is the ratio of sum of correctly classified cases to total cases. Classification accuracy can be defined as:

$$\text{Accuracy} = \frac{\text{True positive} + \text{True negative}}{\text{True positive} + \text{True negative} + \text{False positive} + \text{False negative}}$$
True positive is the number of positive cases that are correctly classified; true negative is the number of negative cases that are correctly classified. False positive is the number of incorrect classifications where the actual case is negative, and false negative is the number of incorrect classifications where the actual case is positive. Figure 9 compares the accuracy of the existing and proposed methods, with the methods on the x-axis and accuracy on the y-axis. In this work, the optimal features are selected using the Dynamic Mutation based Glowworm Swarm Optimization (DMGSO) algorithm, which improves the accuracy of the classifier. The experimental outcomes show that the proposed system attains 99% accuracy, whereas Histogram, MLP, ILBP-ASO and MACO-MANFIS achieve 89%, 91%, 96% and 98% respectively.

Precision
Precision is the ratio of the true positives to the sum of true positives and false positives. It can be expressed as:
$$\text{Precision} = \frac{\text{True positive}}{\text{True positive} + \text{False positive}}$$
The precision of the proposed DMGSO-LSTM is compared with that of the existing Histogram, MLP, ILBP-ASO and MACO-MANFIS approaches, with the methods on the x-axis and precision on the y-axis. The experimental results show that the proposed DMGSO-LSTM approach attains 96% precision, whereas Histogram, MLP, ILBP-ASO and MACO-MANFIS provide 86%, 89%, 92% and 94% respectively.

Recall
Recall value is the ratio of the true positive to the total of true positive and false negative. Mathematically, recall can be defined as:

$$\text{Recall} = \frac{\text{True positive}}{\text{True positive} + \text{False negative}}$$
Figure 11 compares the recall of the methods, with the methods on the x-axis and recall on the y-axis. In this work, the enhanced LSTM approach is used to classify the thyroid nodules with the selected features. Here, the bias values are updated using the weighted average of the features, which improves the true positive rate. The experimental outcomes show that the proposed system attains 95% recall, whereas Histogram, MLP, ILBP-ASO and MACO-MANFIS achieve 81%, 85%, 89% and 91% respectively.
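The three ratios defined so far can be computed directly from the confusion counts. The sketch below uses made-up counts purely for illustration; they are not results from this study, and the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision and recall from confusion-matrix counts.

    Assumes at least one predicted positive and one actual positive,
    otherwise precision or recall would divide by zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Illustrative counts only (hypothetical, not taken from the experiments).
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
print(metrics)
```

Recall is the true positive rate, so a classifier that misses fewer malignant nodules raises this value even if its accuracy stays the same.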

F-measure
F-measure is used to evaluate the classification accuracy. It is calculated by using precision and recall.
The proposed DMGSO-LSTM is compared with the existing Histogram, MLP, ILBP-ASO and MACO-MANFIS approaches in terms of f-measure, with the methods on the x-axis and f-measure on the y-axis. Figure 12 shows that the proposed DMGSO with LSTM algorithm achieves an f-measure of 96%, whereas Histogram, MLP, ILBP-ASO and MACO-MANFIS achieve 83%, 86%, 90% and 92% respectively.
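Assuming the f-measure is the balanced harmonic mean of precision and recall (the usual F1 definition, which the text does not spell out), it can be recomputed from the precision and recall percentages reported in the previous sections. Because those inputs are rounded to whole percentages, the recomputed values need not match the reported f-measures exactly.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall (assumed F1)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the Precision and Recall sections.
reported = {
    "Histogram": (0.86, 0.81),
    "MLP": (0.89, 0.85),
    "ILBP-ASO": (0.92, 0.89),
    "MACO-MANFIS": (0.94, 0.91),
    "DMGSO-LSTM": (0.96, 0.95),
}

for method, (precision, recall) in reported.items():
    print(f"{method}: {f_measure(precision, recall):.3f}")
```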

Conclusion
The proposed system is a Dynamic Mutation based Glowworm Swarm Optimization with Long Short-Term Memory (DMGSO with LSTM) scheme for thyroid nodule classification. DWMF was used to remove unwanted data from the input images. The pre-processed images were segmented with the region based active contour scheme. Three features, namely ILBP, GLCM and HOG, were extracted and optimized using the DMGSO algorithm. An LSTM network was employed to classify the thyroid nodule. Empirical findings demonstrated that the proposed method yields higher classification accuracy than other existing models.