An unconventional SVM classification using Chaos PSO optimization for lung cancer discovery

Objectives: The main purpose of this work is to detect the cancerous region in lung CT images and to classify that region using a Support Vector Machine (SVM) classifier. Methods: An optimization technique is applied after classifying the cancerous region in order to improve the classification accuracy on lung cancer CT images. The proposed method is improved using a novel Chaos Particle Swarm Optimization (CPSO) technique. MATLAB is used to implement the optimization technique. Findings: The SVM classifier using CPSO achieves an accuracy of 97.4%, which is higher than that of PSO with a genetic algorithm (89.5%) and that of genetic optimization for feature selection with an ANN for lung cancer classification (95.87%).


Introduction
Lung cancer is one of the most common diseases affecting people, and it reduces the survival rate. Medical professionals are able to diagnose the disease with the help of CT images. To reduce the mortality rate of cancer patients, early identification of lung cancer followed by proper treatment is needed. However, early identification is not an easy task; it requires some fundamental image processing steps followed by an optimization technique (1). Image processing plays a vital role in disease diagnosis from medical imaging (2). In this paper, a novel approach for lung cancer classification is proposed in which Chaos PSO optimization is utilized. The simulated results are evaluated using performance measures. The authors of (3) proposed a modified bacterial foraging optimization technique for lung cancer classification and evaluated its performance parameters; their simulated results show that a back-propagation neural network performs better than the SVM method. An MLP-NN lung disease classification technique using the PSO algorithm improves feature selection, and its accuracy and bit error rate have also been analyzed (4)(5)(6)(7)(8). The authors of (9) studied various ways to detect the cancerous lung region in CT images based on PSO and a genetic algorithm; the accuracy obtained by this method is 89.5%. Another approach for identifying lung cancer status uses genetic optimization for feature selection and an ANN for lung cancer classification, and provides an accuracy of 95.87% (10). From the literature review, it is observed that accuracy is improved by using optimization techniques. Hence, in the proposed method, a Support Vector Machine with Chaos PSO optimization is applied to lung CT images to improve the accuracy of the simulated results. The primary contribution of this paper is the detection of the lung cancer region using an SVM classifier with a novel CPSO algorithm in order to improve accuracy.
https://www.indjst.org/
The rest of this paper is structured as follows: Section II gives a detailed explanation of the proposed system, Section III presents the simulated results, and Section IV concludes the paper.

Materials and Methods
The proposed method combines the basic image processing steps with an optimization technique. The steps required to implement the system are shown in Figure 1.

Image acquisition
The input image is captured from the database through the image acquisition process; the images obtained from the database are in DICOM format. The images used in the proposed research are taken from the Lung Image Database Consortium (LIDC) dataset. This dataset contains data of 50 patients.

Image pre-processing
Generally, CT images contain a considerable amount of noise, so a noise removal step is mandatory before further processing. The image preprocessing step performs background noise removal. A median filter is used to suppress salt-and-pepper noise in the image. Owing to their de-noising power and computational efficiency, median-based filters are regarded as backbone filters (11).
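As an illustration of the median filtering step described above, the following is a minimal Python sketch rather than the MATLAB implementation used in this work. The 3 x 3 window, the edge-replication border policy and the toy image are assumptions; the paper does not state them.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D list of gray values.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption; the paper does not state its border policy).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp row index
                    cc = min(max(c + dc, 0), cols - 1)  # clamp column index
                    window.append(image[rr][cc])
            out[r][c] = median(window)
    return out


# Toy image: uniform background with one "salt" (255) and one "pepper" (0) pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
clean = median_filter(noisy)
```

Because each 3 x 3 window contains at most two outliers, the median ignores them, which is why this filter suits impulse noise.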

Image segmentation
Segmentation is the process of separating a medical image into various regions. The aim of segmentation is to change the representation of the lung image into something more meaningful, which makes the analysis of the image easier. Adaptive global thresholding is the simplest process used to segment the lung images. It separates the cancerous region from the non-cancerous region very efficiently in CT lung images (12,13). This method is mainly applicable where a single threshold value does not work accurately; in that case, the threshold applied to a pixel depends on its position in the lung image.
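The idea of a position-dependent threshold can be sketched as follows. The exact thresholding rule is not given in this paper, so the local-mean rule, the window size and the offset below are assumptions chosen only for illustration.

```python
def adaptive_threshold(image, window=3, offset=0):
    """Return a binary mask: 1 where a pixel exceeds the mean of its
    window x window neighbourhood by more than `offset`, else 0.

    The local-mean rule is an assumed stand-in for the paper's method.
    """
    rows, cols = len(image), len(image[0])
    half = window // 2
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Neighbourhood is cropped at the image border.
            r0, r1 = max(r - half, 0), min(r + half, rows - 1)
            c0, c1 = max(c - half, 0), min(c + half, cols - 1)
            neighbours = [image[i][j]
                          for i in range(r0, r1 + 1)
                          for j in range(c0, c1 + 1)]
            local_mean = sum(neighbours) / len(neighbours)
            mask[r][c] = 1 if image[r][c] > local_mean + offset else 0
    return mask


# Toy image: a single bright pixel on a darker background.
toy = [
    [20, 20, 20],
    [20, 90, 20],
    [20, 20, 20],
]
mask = adaptive_threshold(toy, window=3, offset=10)
```

Here the threshold is recomputed for every pixel from its own neighbourhood, so only the bright centre pixel is marked as foreground.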

Feature extraction
Feature extraction is mainly performed for the purpose of dimensionality reduction. In this research, we use the Gray Level Co-occurrence Matrix (GLCM), in which the number of rows and columns is equal to the number of gray levels, G. Figure 2 shows the representation of the GLCM (9).
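To make the G x G structure of the GLCM concrete, the sketch below builds a normalised co-occurrence matrix and derives three texture features from it. The horizontal neighbour offset, the non-symmetric counting and the feature formulas (common textbook definitions) are assumptions; the paper does not specify them, and energy is sometimes defined as the square root of the value used here.

```python
def glcm(image, levels, dr=0, dc=1):
    """Count co-occurrences of gray levels (i, j) at offset (dr, dc)
    and normalise the counts so that they sum to 1."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                counts[image[r][c]][image[rr][cc]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Contrast, energy and homogeneity of a normalised GLCM `p`."""
    idx = range(len(p))
    contrast = sum(p[i][j] * (i - j) ** 2 for i in idx for j in idx)
    energy = sum(p[i][j] ** 2 for i in idx for j in idx)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx)
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}


# Toy image with G = 4 gray levels (values 0..3), so the GLCM is 4 x 4.
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy, levels=4)
features = glcm_features(p)
```

For this toy image there are 12 horizontal pixel pairs, so each entry of `p` is a count divided by 12.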

Image classification
After feature reduction by the GLCM technique, the obtained result is provided as the input to the SVM classifier (14). A Support Vector Machine is a binary classification method used to analyze data and recognize patterns: it takes a set of inputs and predicts, for each input, which of two classes it belongs to, which makes the SVM a non-probabilistic binary linear classifier. It is a supervised learning model that uses a kernel function, as shown in Figure 3.
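The binary decision made by an SVM can be illustrated with a small linear example. This is only a sketch: it assumes a linear kernel, trains by plain sub-gradient descent on the regularised hinge loss instead of the solver used in MATLAB, and the two-feature samples are hypothetical values, not features from the LIDC images.

```python
def train_linear_svm(samples, labels, lam=0.01, epochs=500, lr=0.05):
    """Minimise lam/2 * ||w||^2 + hinge loss by sub-gradient descent.

    `labels` must contain +1 (abnormal) or -1 (normal).
    """
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample violates the margin: move the hyperplane towards it.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Only the regulariser contributes to the gradient.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, already normalised two-feature vectors.
samples = [(0.2, 0.1), (0.3, 0.2), (0.1, 0.3),
           (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(samples, labels)
```

The output is a hard class label with no probability attached, which is what "non-probabilistic" refers to in the description above.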

Chaos particle swarm optimization
This paper uses the CPSO algorithm, a population-based algorithm, for solving the thresholding problem. It is a multilevel thresholding technique that accurately finds the threshold value by dividing the target image according to a fitness function, so that the optimization problem can be solved. The optimization starts from a set of random particles and then searches for the optimum by updating them over successive iterations (15). The experimental results show improved accuracy over the existing techniques.
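A generic form of chaos-driven PSO can be sketched as below. The paper does not give its update equations or fitness function, so everything here is an assumption: the logistic map replaces the random number generator, the inertia and acceleration coefficients are illustrative values, and a simple quadratic stands in for the thresholding fitness.

```python
def logistic_map(x=0.7):
    """Yield a chaotic sequence in (0, 1) from x <- 4x(1 - x)."""
    seed = x
    while True:
        x = 4.0 * x * (1.0 - x)
        if not 0.0 < x < 1.0:  # guard against floating-point collapse
            x = seed
        yield x


def chaos_pso(fitness, bounds, particles=30, iterations=200,
              inertia=0.6, c1=1.2, c2=1.2):
    """Minimise `fitness` over the box `bounds` with chaos-driven PSO."""
    chaos = logistic_map()
    dim = len(bounds)
    # Chaotic initial positions, zero initial velocities.
    pos = [[lo + next(chaos) * (hi - lo) for lo, hi in bounds]
           for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(iterations):
        for i in range(particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = next(chaos), next(chaos)
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def toy_fitness(p):
    """Hypothetical fitness with its minimum at (3, -2)."""
    return (p[0] - 3) ** 2 + (p[1] + 2) ** 2


best, best_val, history = chaos_pso(toy_fitness, [(-10, 10), (-10, 10)])
```

Because the logistic map is deterministic, repeated runs give the same result. In the setting of this paper, the fitness would presumably score candidate threshold values or classifier settings instead of the toy quadratic.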

Experimental Results
In this research, the simulation results are obtained using MATLAB. The inputs for this experiment are images taken from the LIDC dataset. The proposed system uses a total of 50 images. Of the dataset, 75% is used for training and 25% is used for testing.
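The 75%/25% partition of the 50 images can be sketched as follows. Since 75% of 50 is 37.5, some rounding rule is needed; rounding down to 37 training and 13 test images is an assumption, as are the shuffling seed and the image identifiers.

```python
import random


def split_dataset(items, train_fraction=0.75, seed=0):
    """Shuffle a copy of `items` and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)  # 50 * 0.75 = 37.5 -> 37
    return shuffled[:cut], shuffled[cut:]


image_ids = [f"scan_{i:02d}" for i in range(50)]  # hypothetical identifiers
train_ids, test_ids = split_dataset(image_ids)
```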
Initially, the input image is preprocessed using a median filter. The preprocessed image is then segmented to isolate the cancerous region, as shown in Figure 4. The GLCM feature values extracted for contrast, correlation, energy, mean and homogeneity are shown in Table 1. All of the extracted features are based on the pixel information of the segmented image region. The mean feature is proportional to the brightness of the image (16,17). The SVM classifier predicts whether the output is normal or abnormal, as shown in Figure 5; this helps medical professionals to find lung cancer at an early stage and thereby reduce mortality worldwide. Figure 6 shows the performance analysis in which SVM using Genetic Algorithm, SVM using PSO and SVM using Chaos PSO are compared on parameters such as accuracy, sensitivity and specificity. The measured performance shows that the proposed system achieves the best accuracy, 97%.
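The three performance parameters compared in Figure 6 follow from the confusion matrix of the classifier. The sketch below uses the standard definitions; the label vectors are hypothetical and are not results from this study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for binary labels.

    Assumes both classes occur in `y_true`, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical labels: 1 = abnormal, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case there are 3 true positives, 1 false negative, 5 true negatives and 1 false positive.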

Discussions
Experiments were carried out on real datasets obtained from reputed databases. The results achieved in this research are compared with a few similar works carried out using SVM classifiers; the selected related research and the results obtained are discussed below.
In (18), a total of 11 patients are considered, with 1 normal case, 2 benign cases and 8 malignant cases, comprising about 3278 sectional images. 50% of the dataset is used for training and 50% for testing. The performance of the three kernels is measured using sensitivity and specificity, and an accuracy of approximately 82.2% is obtained.
A method for the segmentation of MRI, CT and ultrasound images is presented in (19). Cancer cells are identified by studying the necessary features extracted from the images, and ultrasound images are also used to check the validity of the system. Feature selection is performed using PSO, genetic optimization and an SVM algorithm, giving an accuracy of about 89.5% with a reduction in false positives.
The Lung Image Database Consortium (LIDC) dataset has been used for training and testing in (20), with classification performed by means of an SVM. A Receiver Operating Characteristic (ROC) curve is used to analyze the performance of the system. Overall, the system has an accuracy of 95.16%, a sensitivity of 98.21% and a specificity of 78.69%. An SVM provides an accuracy of 92.5% in (21).
The accuracy achieved by the SVM classifier using CPSO is 97.4%, which is higher than that of PSO with a genetic algorithm (89.5%) and that of genetic optimization for feature selection with an ANN for lung cancer classification (95.85%). When the proposed method is compared with the existing SVM-PSO and genetic algorithm methods, it yields a higher accuracy. The accuracy of the system could be improved further if training were performed on a very large image database. Different basic image processing techniques are used for prediction. The proposed system helps physicians to extract the tumor region and evaluate whether the tumor is benign or malignant.

Conclusion
The proposed research aims to detect lung cancer at an early stage with the help of a CT image; the method works by classifying the given input CT lung images as normal or abnormal. The classification is carried out by a Support Vector Machine classifier. The simulation is carried out using MATLAB. Initially, the images are preprocessed by a median filter, followed by adaptive global thresholding segmentation. The segmented output is processed using the GLCM method to extract features for classifying the images. According to the simulation results, the accuracy obtained by the SVM is 97%. The accuracy can be further improved by using SVM with Chaos PSO, which yields higher accuracy than the other optimization techniques.
The proposed Chaos PSO with SVM is compared with two existing methods, namely SVM with Genetic algorithm and SVM with PSO. The proposed method performs well, with an accuracy of 97.6%, whereas the other two methods obtain only 89.3% and 95.85%, respectively.