An Efficient Medical Image Retrieval and Classification using Deep Neural Network

Background/Objectives: The main objective of this work is to achieve efficient brain tumor image retrieval and classification using a Deep Neural Network (DNN). Methods/Statistical analysis: Features are extracted from the medical images using Tamura feature extraction, Local Ternary Pattern (LTP) and Histogram of Oriented Gradients (HOG). Subsequently, an Infinite Feature Selection (Inf-FS) technique selects the optimum features from the feature vector, which improves classification by the sparse autoencoder based DNN. Furthermore, retrieval is performed with a Euclidean distance technique. Findings: The Open Access Series of Imaging Studies (OASIS) and Contrast Enhanced Magnetic Resonance Image (CE-MRI) datasets are used to analyze the proposed method. The sparse autoencoder based DNN classification scheme yields an overall accuracy of 95.34% on the OASIS dataset and 99.87% on the CE-MRI dataset, with improved sensitivity, specificity and error rate. The retrieval performance of the proposed technique is assessed in terms of Average Retrieval Precision (ARP) and compared with two existing methods, Local Mesh Vector Co-occurrence Pattern (LMVCoP) and Content Based Image Retrieval-Convolutional Neural Network (CBIR-CNN). The ARP of the proposed method is 98.33% for the CE-MRI dataset and 88.25% for the OASIS dataset, which is higher than that of the CBIR-CNN and LMVCoP methods. Novelty/Applications: Appropriate feature selection using Inf-FS and DNN based classification of nonlinear feature data are applied to medical image retrieval.

• Three texture feature extraction techniques, Tamura, LTP and HOG, are used and then hybridized to enhance the classification performance on MRI images. The main advantage of Tamura is that its features are derived from a psychophysical context. Moreover, HOG is chosen for its invariance to geometric and photometric transformations, and LTP is highly robust to noise.
• Additionally, the optimum features are chosen from the feature vector using the Inf-FS technique, as it assesses the importance of each feature while considering all of its possible feature subsets. These attributes are learned through the DNN based classification scheme.
• Moreover, removing irrelevant features from the feature vector helps to improve the classification accuracy. The Euclidean distance based image retrieval technique is used to retrieve brain MRI images from a large database.

Literature survey
This section provides a literature survey of recent trends in medical image retrieval systems. Generally, processing and analyzing brain tumors in MRI images is a challenging task for computer vision scientists. Moreover, the selection of the brain tumor region is considered a crucial step in brain tumor classification. In addition, various healthcare applications have been developed, such as classification of stomach infections (26), recognition of blood cells (27) and Wireless Capsule Endoscopy for abnormality detection (e.g., ulcer, gastric infection) (28) (29) (30). The authors of (31) developed a brain parcellation tool in which all anatomical images were converted into a standardized index through image transformation and an atlas-based parcellation technique. Primary progressive aphasia was selected because of the apparent degeneration of cells at various degrees and locations in patients. Furthermore, Partial Least Squares-Discriminant Analysis (PLS-DA) and Principal Component Analysis (PCA) were used to perform individual classifications over various medical images. However, several pieces of critical information, such as distance and clinical severity, were not included in the analysis.
The authors of (32) presented DNN and Extreme Learning Machine (ELM) based brain image classification using prominent features. The contourlet transform and Zernike moments were used to extract the texture and shape features respectively. Then, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) based feature selection was used to hybridize the features and to remove the redundant ones. The DNN and ELM based methods only identify brain tumor images in the database; they fail to classify the type of brain tumor.
The authors of (33) presented a combination of Local Pattern Descriptor (LPD) and Gray-level Co-occurrence Matrix (GLCM) as a feature extraction approach for medical images. This work implements a Local Mesh Vector Co-occurrence Pattern (LMVCoP) by integrating the Local Vector Co-occurrence Pattern (LVCoP) and the Local Mesh Co-occurrence Pattern (LMCoP). An OASIS database consisting of MRI brain images was used to evaluate the LMVCoP method. In clinical applications, fast decision making was achieved with LMVCoP based image retrieval. However, this method fails to retrieve images effectively from extensive high dimensional datasets.
The authors of (34) developed a Co-Active Adaptive Neuro Fuzzy Inference System (CANFIS) classifier and a Support Vector Machine (SVM) for diagnosing brain tumors. The features from LTP and GLCM were learned using the SVM and CANFIS classifiers to categorize the brain images as normal or abnormal. From each image classified as abnormal, the tumor region was segmented using the normalized graph cut approach. However, CANFIS did not handle values outside the learning database effectively, and a large database was required to train the classification model.
The authors of (35) presented a CBIR system for brain tumor images. The T1-weighted CE-MRI dataset was used to validate the image retrieval performance. Here, feature extraction was performed using a deep Convolutional Neural Network (CNN) model (VGG19). Additionally, the retrieval performance was enhanced by using transfer learning and a block-wise fine-tuning strategy. The results show that this method works well only for small datasets, particularly in the medical imaging domain, and fails to perform well on large datasets.
The authors of (36) presented a deep learning-based method for microscopic brain tumor detection and tumor type classification. The brain tumor was extracted using a 3D Convolutional Neural Network (CNN) architecture on two different datasets, BraTS2015 and BraTS2018. Next, the extracted tumors were passed to a pretrained CNN model for feature extraction. The optimum features were obtained by applying a correlation-based selection method to the extracted features. The final classification was then obtained and validated using a feed-forward neural network. However, the classification time increased for the tumor images segmented by the 3D CNN.
The authors of (37) developed a hybrid technique to extract tumors from MRI images. This hybrid technique comprises five steps: image de-noising, tumor extraction, feature selection, feature fusion and classification. Initially, image denoising was carried out on two different datasets, BRATS2013 and a private dataset, using the curvelet transformation. Ant colony optimization was used along with a thresholding method to extract the tumor from the MRI scans. Multiple types of features, such as shape and texture, were extracted from the tumor images. Next, the PCA reduced skewness method was used to reduce the irrelevant features. The selected features were then given to the SVM to classify the tumor portions. However, the SVM classifier required 70% of the features to classify the brain tumors.
The authors of (38) presented a 3D reconstruction algorithm for CT image features based on multi-threaded deep learning calculations. The segmentation of the image feature texture was obtained using the DeepLabv2 algorithm. From the content of the CT image sub-blocks, optimal feature volume data were acquired, which were used to solve the issue of packet loss during feature extraction. Next, the Compute Unified Device Architecture (CUDA) was used for multi-threaded calculation of CT image features, which replaced the serial processing of the content with processing over multiple sub-blocks. However, this CUDA based approach incurred a longer reconfiguration time when processing multiple CT slices.
Machine learning methods extract the features and perform the classification. Accordingly, complexity and time consumption increase when the extracted features are highly dimensional (39) (40). Feature selection and reduction algorithms benefit both the accuracy and the execution time in medical imaging applications. Additionally, selecting the significant features minimizes the overall complexity of the system (41). Hence, the proposed method uses the Inf-FS method to select the optimal features from the feature vector. This selection of relevant features improves the classification accuracy of the sparse autoencoder based DNN.

Proposed method
In the proposed method, effective retrieval of medical images is performed using optimal feature extraction, DNN and Euclidean distance techniques. The steps of this medical image retrieval system are as follows: 1) CLAHE based image enhancement, 2) feature extraction, 3) classification using the sparse autoencoder based DNN and 4) Euclidean distance based image retrieval. Figure 1 depicts the overall architecture of the proposed method.

Data acquisition
In the proposed method, two different databases, OASIS (42) and CE-MRI (43), are used to analyze the image retrieval performance. The OASIS repository contains a collection of cross-sectional images of subjects aged 18 to 96. The CE-MRI dataset has three types of brain tumor images: Glioma, Meningioma and Pituitary tumor. The images collected from the databases are subjected to the CLAHE method to enhance the image quality. The sample images from the OASIS and CE-MRI are shown in Figure

Image enhancement using CLAHE
The CLAHE method preprocesses all the images collected from the OASIS and CE-MRI datasets. The nonlinear property of CLAHE is used to modify the intensity values and thereby enhance image contrast. The clip limit of CLAHE addresses the problem of noise amplification: the histogram is clipped at a predefined value to limit the amplification. In this approach, the original image is divided into non-overlapping contextual regions, and two main parameters, the block size and the clip limit, control the image quality. The process of CLAHE is described as follows:
• Initially, the original image (I) is divided into non-overlapping contextual regions. The total number of image tiles equals M × N.
• For every contextual region, the histogram is calculated.
• The clipped histogram and gray level mapping are generated. The pixels are distributed equally over the gray levels, and the average number of pixels is given by equation (1).
$$N_{avg} = \frac{N_{rX} \times N_{rY}}{N_{gray}} \quad (1)$$
where $N_{avg}$ represents the average number of pixels and $N_{gray}$ represents the number of gray levels in the contextual region. The numbers of pixels in the X and Y dimensions are $N_{rX}$ and $N_{rY}$ respectively. The clip limit is expressed in equation (2).
$$N_{CL} = N_{clip} \times N_{avg} \quad (2)$$
where $N_{CL}$ represents the actual clip limit and $N_{clip}$ is the normalized clip limit in the range [0, 1]. An example of an image preprocessed using CLAHE is shown in Figure 3. The preprocessed images ($I'$) are passed to feature extraction for subsequent processing.
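The clipping step can be illustrated with a minimal sketch. It assumes that equation (1) divides the pixel count of a contextual region by its number of gray levels, that equation (2) scales this average by the normalized clip limit, and that the clipped excess is redistributed uniformly over all gray levels. The function name `clip_histogram` and the uniform redistribution rule are illustrative assumptions, not details taken from the paper.

```python
def clip_histogram(hist, n_clip):
    """Clip the histogram of one contextual region and redistribute the excess.

    hist   : list of pixel counts, one entry per gray level (N_gray entries)
    n_clip : normalized clip limit in [0, 1]
    Returns (n_avg, n_cl, clipped_histogram).
    """
    if not 0.0 <= n_clip <= 1.0:
        raise ValueError("normalized clip limit must lie in [0, 1]")
    n_gray = len(hist)
    n_pixels = sum(hist)            # N_rX * N_rY for this region
    n_avg = n_pixels / n_gray       # average pixels per gray level, eq. (1)
    n_cl = n_clip * n_avg           # actual clip limit, eq. (2)

    # Clip every bin at the actual clip limit.
    clipped = [min(count, n_cl) for count in hist]

    # Assumed rule: spread the clipped excess evenly over all gray levels,
    # so the total pixel count of the region is preserved.
    excess = n_pixels - sum(clipped)
    share = excess / n_gray
    return n_avg, n_cl, [count + share for count in clipped]


if __name__ == "__main__":
    n_avg, n_cl, clipped = clip_histogram([10, 0, 2, 4], n_clip=0.5)
    print(n_avg, n_cl, clipped)
```

In practice the clipped histogram of each region would then be turned into a gray level mapping through its cumulative sum, a step omitted here for brevity.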

Feature extraction
In the proposed method, three different feature extraction methods, Tamura, LTP and the HOG descriptor, are used to extract features from the preprocessed image. Tamura feature extraction describes image texture in terms of coarseness, contrast and directionality. The main advantage of LTP is its resilience to noise, while the HOG descriptor captures the orientation of the image through global features. Combining these features is essential to enhance the classification accuracy over the different classes of MRI brain images.

Tamura feature extraction
Tamura is a texture-based feature extraction method used to represent image content, and it is extensively used in image searching and retrieval. Three Tamura features are extracted: coarseness, contrast and directionality.

• Coarseness
Coarseness measures the size of the texture elements, and it can be expressed as:
$$F_{crs} = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} S_{best}(i, j) \quad (3)$$
where $m \times n$ is the size of the image, $S_{best}(i, j) = 2^k$ with $k = \arg\max\{E_1, E_2, \ldots, E_k, \ldots, E_L\}$, and $E$ represents the difference of average intensity in the horizontal and vertical directions. Moreover, this average intensity difference $E$ is calculated using the gray level at each position of image $I'$.

• Contrast
The intensity difference within a local region of an image is defined as contrast, which is expressed in equation (4).
$$F_{con} = \frac{\sigma}{\left(\mu_4 / \sigma^4\right)^{n_0}} \quad (4)$$
where $\sigma$ is the standard deviation, $\sigma^4$ is the squared variance and $\mu_4$ is the fourth moment about the mean. Here, the value of $n_0$ for contrast is fixed at 0.25.
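As an illustration of the contrast feature, the sketch below computes the standard deviation divided by the kurtosis raised to the power n0, using the quantities named above (standard deviation, squared variance, fourth moment about the mean, n0 = 0.25). The function name `tamura_contrast` and the handling of a perfectly flat region are assumptions made for this example.

```python
import math


def tamura_contrast(pixels, n0=0.25):
    """Tamura contrast of a flat list of gray levels.

    contrast = sigma / (mu4 / sigma^4) ** n0
    """
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    if variance == 0:
        # Assumed convention: a constant region has zero contrast.
        return 0.0
    mu4 = sum((p - mean) ** 4 for p in pixels) / n   # fourth central moment
    kurtosis = mu4 / variance ** 2                   # mu4 / sigma^4
    return math.sqrt(variance) / kurtosis ** n0


if __name__ == "__main__":
    print(tamura_contrast([0, 0, 2, 2]))
    print(tamura_contrast([7, 7, 7, 7]))
```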

• Directionality
Directionality provides information about a global feature extracted from a small region, and it is represented using the histogram of local edges, as shown in equation (5), where the quantized direction code is represented as $\phi$, the histogram peak value is $P$, the range of the $P$th peak between valleys is denoted as $W_P$ and the position of the $P$th peak of the histogram is represented as $\phi_P$.

Local ternary pattern
LTP is an extension of LBP in which each neighbor pixel value is encoded into a three-valued code based on a user-defined threshold. The binary patterns obtained from the LTP are used as elements of the local image descriptor. LTP therefore assigns a label to each pixel $p_i$ of an image ($I'$) based on the threshold value. The central pixel $p_c$ and its adjacent values are used to assign the label. Equations (6) and (7) express the LTP operator,
where the user-defined threshold is represented as $w$.
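A minimal sketch of the ternary coding for a single 3 × 3 neighborhood follows. It assumes the commonly used convention in which a neighbor is coded +1 when it is at least w above the central pixel, -1 when it is at least w below, and 0 otherwise, and in which the ternary code is split into an upper and a lower binary pattern. The neighbor ordering and the function name `ltp_codes` are illustrative assumptions.

```python
def ltp_codes(patch, w):
    """Local ternary pattern of the central pixel of a 3x3 patch.

    patch : 3x3 nested list of gray levels
    w     : user-defined threshold
    Returns (ternary, upper, lower): the eight ternary digits and the two
    binary patterns obtained from the +1 and -1 digits.
    """
    center = patch[1][1]
    # Assumed ordering: clockwise, starting at the top-left neighbor.
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    ternary, upper, lower = [], 0, 0
    for bit, (r, c) in enumerate(coords):
        diff = patch[r][c] - center
        if diff >= w:
            digit = 1
            upper |= 1 << bit      # contributes to the upper pattern
        elif diff <= -w:
            digit = -1
            lower |= 1 << bit      # contributes to the lower pattern
        else:
            digit = 0              # within the tolerance band around p_c
        ternary.append(digit)
    return ternary, upper, lower


if __name__ == "__main__":
    patch = [[54, 57, 62],
             [60, 55, 49],
             [70, 56, 40]]
    print(ltp_codes(patch, w=5))
```

Histograms of the upper and lower codes over all pixels would then form the LTP part of the feature vector.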

HOG descriptor
The HOG descriptor depends on the gradient directions of the pixels, and it is computed over a small spatial region called a cell. The feature vector is formed by concatenating the 1D histograms of the cells and is used for further processing. The preprocessed image $I'$ is divided into cells of N × N pixels, and the gradient orientation of each pixel is computed by the HOG descriptor, as expressed by equation (8). Here, the orientations $\theta_i^j$, $i = 1, \ldots, N^2$, belonging to the same cell $j$ are quantized and accumulated into a histogram. The histograms obtained are then arranged and concatenated into a single HOG histogram, which is the final HOG feature vector.
The feature vector (F) is generated from the Tamura, LTP and HOG features. Equation (9) represents the generated feature vector.
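The quantization of orientations within one cell can be sketched as follows. The sketch assumes central-difference gradients with edge replication, unsigned orientations in [0, 180) degrees, nine bins and magnitude-weighted votes; these are common HOG settings chosen for illustration, not parameters reported in the paper.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of gradient orientations for one cell.

    cell : 2D nested list of gray levels (N x N pixels)
    """
    rows, cols = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for r in range(rows):
        for c in range(cols):
            # Central differences; border pixels reuse their own value.
            gx = cell[r][min(c + 1, cols - 1)] - cell[r][max(c - 1, 0)]
            gy = cell[min(r + 1, rows - 1)][c] - cell[max(r - 1, 0)][c]
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation folded into [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


if __name__ == "__main__":
    ramp = [[0, 1, 2],
            [0, 1, 2],
            [0, 1, 2]]
    print(cell_orientation_histogram(ramp))
```

Concatenating the histograms of all cells gives the HOG feature vector described above.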

Infinite feature selection
The prime advantage of the Inf-FS approach is that it ranks the importance of each feature while considering all possible feature subsets. Appropriate feature selection is used to construct low dimensional feature vectors, which reduces the time consumption and increases the classification accuracy. The set of features given to the feature selection is $F = \{f^{(1)}, f^{(2)}, \ldots, f^{(n)}\}$. The features are mapped onto a fully connected graph G in which each vertex corresponds to a feature and each edge models the pairwise relation between two features. G is represented by its adjacency matrix Z, and each element $z_{ij}$ of Z specifies the nature of the corresponding edge, as given by equation (10).
Where the loading coefficient is $\alpha \in [0, 1]$, $\sigma^{(i)}$ is the standard deviation over the samples and the correlation coefficient is the Spearman rank correlation coefficient. The pairwise energy indicates how discriminative the pair of features $f^{(i)}$ and $f^{(j)}$ is. A path of finite length $l$ between the vertices $i$ and $j$ is represented as $\gamma = \{u_0 = i, u_1, \ldots, u_{l-1}, u_l = j\}$, which is a subset of feature pairs along the path, and the energy of $\gamma$ is given by equation (11). The energy of the $i$th feature is computed by extending the path length to infinity, as expressed in equation (12).
Where the set of vertices is represented as U and $Q^l_{i,j}$ denotes the paths of length $l$ between $i$ and $j$. Because the infinite sum of $Z^l$ may diverge, the real-valued regularization factor of equation (13) is used to ensure that the infinite sum converges.
The value $v_R(i)$ is calculated efficiently by using the convergence property of the geometric power series of a matrix, as given in equation (14).
Marginalizing this quantity gives the final energy score for each feature, as expressed in equation (15).
The features are ranked by arranging the energy scores $v_R(i)$ in descending order, and the selected features are represented as Y. Inf-FS is used to select only the most significant features from the feature set: 2585 features are selected from the 8615 features, and these selected features are given to the DNN to improve the classification.
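The ranking procedure can be sketched in a few lines. The sketch assumes the edge weight used in the original Inf-FS formulation, z_ij = alpha * max(sigma_i, sigma_j) + (1 - alpha) * (1 - |Spearman(f_i, f_j)|), with standard deviations normalized by their maximum. For simplicity it approximates the regularized infinite sum by a truncated power series instead of the closed-form matrix inverse, and it picks the regularization factor as 0.9 divided by the largest row sum of Z, which keeps the series convergent. All function names and these simplifications are assumptions of this example.

```python
import math


def _ranks(values):
    """1-based ranks with ties replaced by their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def _pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def _std(a):
    m = sum(a) / len(a)
    return math.sqrt(sum((x - m) ** 2 for x in a) / len(a))


def inf_fs_scores(columns, alpha=0.5, n_terms=50):
    """Energy score per feature; columns[i] holds feature i over all samples."""
    n = len(columns)
    stds = [_std(col) for col in columns]
    max_std = max(stds) or 1.0
    stds = [s / max_std for s in stds]
    ranks = [_ranks(col) for col in columns]

    # Adjacency matrix Z of the fully connected feature graph.
    z = [[alpha * max(stds[i], stds[j])
          + (1 - alpha) * (1 - abs(_pearson(ranks[i], ranks[j])))
          for j in range(n)] for i in range(n)]

    # Regularization factor below 1 / (largest row sum) guarantees convergence.
    r = 0.9 / (max(sum(row) for row in z) or 1.0)
    rz = [[r * v for v in row] for row in z]

    # Truncated geometric series: sum over l = 1..n_terms of (rZ)^l.
    power = [row[:] for row in rz]
    total = [row[:] for row in rz]
    for _ in range(n_terms - 1):
        power = [[sum(power[i][k] * rz[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]

    # Marginalize over the end vertex to get one score per feature.
    return [sum(row) for row in total]


if __name__ == "__main__":
    features = [[1, 2, 3, 4], [2, 4, 6, 8], [4, -4, -4, 4]]
    scores = inf_fs_scores(features)
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    print(scores, ranking)
```

Keeping the top-ranked indices of `ranking` corresponds to the selection step described above; a production implementation would typically use a linear algebra library and the closed-form inverse.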

Sparse autoencoder based DNN for classification
The autoencoder is applied in each layer of the DNN (44). The given features are used to compute each label's probability related to the residue. The input given to this encoder based DNN is $Y = \{y_1, y_2, \ldots, y_i, \ldots, y_M\}$, $y_i \in S^L$, where the number of features is L and M is the length. The input to the sparse autoencoder is the set of features selected by the infinite feature selection. In this neural network, the sigmoid function is used as the activation function. The feature representation for the input matrix Y is shown in equation (16).
The conventional autoencoder learns an approximation to the identity function. In the sparse autoencoder, the number of active neurons is constrained by adding a sparsity penalty term to the objective function. Equation (17) shows the cost function of the sparse autoencoder.
Where the first term is the average sum-of-squares error and the number of examples in the training set is L. The relative weight of the regularization term is controlled by $\lambda$. The number of hidden neurons is $s_2$, the sparsity penalty weight is $\beta$ and the Kullback-Leibler divergence is KL(.). The Kullback-Leibler divergence is expressed in equation (18).
Therefore, this sparse autoencoder based DNN is helpful for classifying the different classes of MRI images.
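The structure of the sparse autoencoder cost can be illustrated with a short sketch. It assumes the standard form of the objective: reconstruction error, plus a weight decay term scaled by lambda, plus beta times the sum of Kullback-Leibler divergences between a target sparsity and the mean activation of each hidden neuron. The target sparsity `rho`, the default values and the function names are assumptions of this example, since the paper's equations (17) and (18) are not reproduced here.

```python
import math


def kl_divergence(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return (rho * math.log(rho / rho_hat)
            + (1 - rho) * math.log((1 - rho) / (1 - rho_hat)))


def sparse_autoencoder_cost(recon_error, weights, mean_activations,
                            lam=1e-4, beta=3.0, rho=0.05):
    """Total cost = reconstruction + weight decay + sparsity penalty.

    recon_error      : average sum-of-squares reconstruction error
    weights          : flat list of all connection weights
    mean_activations : mean activation of each hidden neuron (s_2 values),
                       each strictly between 0 and 1 for sigmoid units
    """
    weight_decay = 0.5 * lam * sum(w * w for w in weights)
    sparsity = beta * sum(kl_divergence(rho, a) for a in mean_activations)
    return recon_error + weight_decay + sparsity


if __name__ == "__main__":
    print(sparse_autoencoder_cost(1.0, [1.0, 1.0], [0.05, 0.2]))
```

The penalty is zero when every hidden neuron's mean activation equals the target sparsity and grows as activations move away from it.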

Euclidean distance based medical image retrieval
In image retrieval, the Euclidean distance measures the similarity between the query image and the retrieved image. This Euclidean distance based similarity measure is used because of its efficiency and effectiveness. The distance between the feature vectors of two images is calculated as the square root of the sum of the squared differences. The matching between the query image and the images from the database is expressed as:
$$D(q, d) = \sqrt{\sum_{i=1}^{n} (q_i - d_i)^2} \quad (19)$$
where $q$ and $d$ are the feature vectors of the query image and a database image, and $n$ is their length.
In the proposed system, effective image retrieval and classification of image types are achieved by using appropriate feature extraction and classification. Additionally, Inf-FS is used to select the optimum features from the generated feature vectors. The DNN based classification is used in this medical image retrieval system because of its robustness on large databases.
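The retrieval step amounts to ranking database images by their distance to the query. The sketch below assumes that every image is already represented by a feature vector of equal length; the names `retrieve`, `database` and `top_k` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Square root of the sum of squared differences of two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, database, top_k=3):
    """Return the top_k (name, distance) pairs closest to the query.

    database : dict mapping an image name to its feature vector
    """
    ranked = sorted(
        ((name, euclidean_distance(query, vec)) for name, vec in database.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:top_k]


if __name__ == "__main__":
    database = {"a": [0, 0], "b": [3, 4], "c": [1, 0]}
    print(retrieve([0, 0], database, top_k=2))
```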

Experimental setup and performance evaluation
The experimental setup and performance evaluation of the medical image retrieval system are described in this section. The proposed classification and retrieval method is applied to two different databases, OASIS and CE-MRI. The medical image retrieval system is implemented in MATLAB R2019a on hardware with 4GB RAM. In both databases, 80% of the images are used for training and 20% of the images are used for testing the retrieval performance. The performance metrics used to analyze the proposed method are accuracy, sensitivity, specificity and error rate, which are explained as follows:
Accuracy: Accuracy is defined as the percentage of correctly identified brain images from the OASIS and CE-MRI datasets.
Sensitivity: Sensitivity is the proportion of actual positive images that are correctly identified during classification.
Specificity: Specificity is the proportion of actual negative images that are correctly identified during classification.
Error rate: The error rate is defined as the proportion of errors that occur during image retrieval.
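These metrics follow directly from the confusion counts. The sketch below uses the standard definitions in terms of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN), and it assumes that the error rate is the complement of accuracy; the function name and the example counts are illustrative only.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and error rate, in percent."""
    total = tp + tn + fp + fn
    accuracy = 100.0 * (tp + tn) / total
    return {
        "accuracy": accuracy,
        "sensitivity": 100.0 * tp / (tp + fn),   # true positive rate
        "specificity": 100.0 * tn / (tn + fp),   # true negative rate
        "error_rate": 100.0 - accuracy,          # assumed: 100 - accuracy
    }


if __name__ == "__main__":
    print(classification_metrics(tp=40, tn=50, fp=5, fn=5))
```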

Average retrieval precision (ARP):
It is defined as the ratio between the average of the total number of relevant images retrieved and total retrieved images.
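A minimal sketch of the ARP computation is given below. It assumes that a retrieved image is relevant when its class label matches the class of the query, and that ARP is the mean per-query precision expressed in percent; the function names and the example labels are hypothetical.

```python
def retrieval_precision(query_label, retrieved_labels):
    """Fraction of retrieved images that share the query's class."""
    relevant = sum(1 for label in retrieved_labels if label == query_label)
    return relevant / len(retrieved_labels)


def average_retrieval_precision(queries):
    """Mean precision over all queries, in percent.

    queries : list of (query_label, retrieved_labels) pairs
    """
    precisions = [retrieval_precision(q, r) for q, r in queries]
    return 100.0 * sum(precisions) / len(precisions)


if __name__ == "__main__":
    queries = [
        ("glioma", ["glioma", "glioma", "meningioma", "glioma"]),
        ("pituitary", ["pituitary", "glioma"]),
    ]
    print(average_retrieval_precision(queries))
```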

Performance evaluation of proposed method
The performance of the proposed method on the OASIS and CE-MRI datasets is analyzed in this section. The performance is evaluated in terms of classification and retrieval results on medical images. In classification, the MRI images are classified into seven classes: meningioma, glioma, pituitary, group1, group2, group3 and group4. The images in each group are randomly selected based on ventricular shape. The classification performance is improved by extracting the optimal features from the images using Tamura, LTP and the HOG descriptor. The classification results are assessed in terms of accuracy, sensitivity, specificity and error rate (E-rate). The classification results of the proposed method are validated against two different classifiers, namely Random Forest Classifier (RFC) and Neural Network (NN). Moreover, the classification results are analyzed for individual features and hybrid features. For the hybrid features, classification is performed with and without feature selection. Based on the output of the classifier, the TP, TN, FP and FN values are estimated, and from these values the performance metrics (accuracy, sensitivity, specificity and error rate) are calculated for both the OASIS and CE-MRI datasets.
From Tables 1 and 2, the accuracy of NN and RFC for each single feature is comparatively low. However, the hybrid features perform better with infinite feature selection than without it. For example, the accuracy of RFC for the OASIS dataset is 61.39%, which is higher than the accuracy of NN, i.e., 60.83%. Similarly, the performance analysis for the OASIS dataset is depicted in Table 3. From Table 3, the proposed method achieves a classification accuracy of 95.34% with feature selection, which is higher than the accuracy of NN and RFC, 60.83% and 61.39% respectively.
The combination of high level features (LTP, HOG) and a low level feature (Tamura) helps to improve the classification accuracy. The classifier with Inf-FS performs better than the classifier without Inf-FS, because Inf-FS discards the irrelevant features obtained from LTP, HOG and Tamura. Here, Inf-FS selects the relevant features from the feature vector based on the ranking order. Finally, the performance metrics of each classifier with respect to the feature selection mechanism for hybrid features are presented graphically in Figure 4. The performance analysis for the CE-MRI dataset is shown in Tables 4, 5 and 6. From Table 6, the proposed technique has better classification accuracy than RFC and NN. For example, the accuracy of the proposed method for the CE-MRI dataset is 99.87% with feature selection, which is higher than the accuracy of NN and RFC, 85.79% and 96.67% respectively; this is presented graphically in Figure 5. The relevant feature selection of Inf-FS and
Here, Euclidean distance based image retrieval is used to retrieve the medical images. The Euclidean distance is computed between the query image feature vector and the feature vectors of the training images. The retrieval results of the proposed method are calculated in terms of Average Retrieval Precision (ARP). As in classification, the retrieval experiments are carried out on the two datasets with seven different classes, namely meningioma, glioma, pituitary, group1, group2, group3 and group4. The ARP values for the OASIS and CE-MRI datasets are shown in Tables 7 and 8 respectively. The ARP is calculated to analyze the retrieval performance on the OASIS and CE-MRI datasets. Also, Figures 5 and 6 show that the proposed method retrieves the appropriate images from the OASIS and CE-MRI datasets respectively.
The ARP for the OASIS dataset varies across classes because the images in group1, group2, group3 and group4 have similar sizes. Similarly, the ARP for the CE-MRI dataset varies across classes because the tumor portions of meningioma, glioma and pituitary tumor are similar.

Comparative analysis
The average retrieval precision of the proposed method is compared with existing techniques to assess its effectiveness in medical image retrieval. The existing methods used for the comparison are LMVCoP (33) and CBIR-CNN (35). In LMVCoP (33), two different feature extraction approaches, LPD and GLCM, are used, and LMVCoP is a combination of LVCoP and LMCoP. In CBIR-CNN (35), a deep CNN is used for feature extraction, and transfer learning and block-wise fine-tuning strategies are used to perform the medical image retrieval. The image retrieval of LMVCoP (33) and CBIR-CNN (35) is analyzed on the OASIS and CE-MRI datasets respectively.

Table 9. ARP comparison of proposed method with existing methods
Dataset    Method           ARP (%)
OASIS      LMVCoP (33)      71.58
OASIS      Proposed method  88.25
CE-MRI     CBIR-CNN (35)    96.13
CE-MRI     Proposed method  98.33

The comparative performance of the retrieval results is given in Table 9. Here, the comparison is made in terms of ARP to analyze the retrieval performance on the OASIS and CE-MRI datasets. For example, the graphical illustration of the ARP performance for the OASIS dataset is given in Figure 8. The results show that the proposed method performs better than LMVCoP (33) and CBIR-CNN (35) in medical image retrieval. The ARP of the proposed method is 98.33% for the CE-MRI dataset, owing to the combination of low and high level features and the selection of relevant features from the feature vector. The higher ARP on the CE-MRI dataset is accompanied by high classification accuracy and precision. Both LMVCoP (33) and CBIR-CNN (35) failed to achieve a higher ARP on large datasets. The higher retrieval performance is due to the DNN and the Euclidean distance based image retrieval over a variety of images, since the classification is improved by appropriate feature extraction and Inf-FS based feature selection.

Conclusion
In this work, effective medical image retrieval and classification are achieved by using Euclidean distance based retrieval and a sparse autoencoder based DNN respectively. The classification performance is enhanced by applying Inf-FS to the feature vectors of Tamura features, LTP and HOG. Inf-FS removes the irrelevant features from the feature vector, which helps to improve the classification accuracy. Here, the Tamura features describe the texture of the image and the HOG descriptor provides the orientation of the image. Moreover, the Euclidean distance based image retrieval gives better performance on medical images due to its efficiency. The classification performance of the proposed method is compared with two classifiers, namely NN and RFC. The accuracy of the proposed method on the CE-MRI dataset is 99.87%, which is higher than that of NN and RFC. The proposed method with Inf-FS on the CE-MRI dataset also has a higher accuracy (99.87%) than the method without Inf-FS (99.17%). Moreover, the ARP of the proposed method for the OASIS dataset is 88.25%, which is higher than that of the LMVCoP method. In the future, an ensemble feature selection method can be used to improve the classification performance for brain tumors.