A Novel Parameter Reduction Method for Simplifying Improved RC-IDeepRT

Objectives: To propose a method that decreases the model parameters of Reinforcement learning with Compressed Improved DeepRT (RC-IDeepRT) for Cyber-Physical System (CPS) data analysis. Methods: A parameter reduction method is proposed that decreases the model parameters by combining pruning and factorization. Pruning may not save memory on traditional computers unless sparse matrices are supported and explicitly used; however, it directly reduces the usage of synapses in a neuromorphic architecture. This method minimizes the total number of neurons and synapses required for a given trained model. Findings: The integration of factorization and pruning allows a sparsely connected network, combining reinforcement learning with deep learning, to be created from a given trained network. The dangling neurons in the network are identified, and their remaining connections are pruned as well. The pruned network is cost-effective and classifies the CPS data effectively. Novelty: The proposed method minimizes the number of parameters in RC-IDeepRT. The number of parameters is a good surrogate metric, assuming the deep learning architecture accommodates sparse networks effectively. The reduced model parameters are used for CPS data classification.


Introduction
Cyber-Physical Systems (CPS) (1)(2)(3)(4)(5)(6), which are designed to revolutionize society by introducing innovations, are spreading rapidly across the world. The dynamic nature of sensors in CPS (7) creates a huge volume of data, which requires effective techniques for CPS data analysis. Because CPS have gained significant exposure through applications such as intelligent cities, smart homes with appliance networks, mobility services, environmental monitoring, and smart grids, the classification of CPS data is increasingly required. An Improved DeepRT (IDeepRT) (8) was proposed to enhance the CPS data classification accuracy. IDeepRT integrates a Convolutional Neural Network (CNN) (9) and a Recurrent Neural Network (RNN); it takes the best feature space as input and classifies the CPS data. However, the Improved Feature Space Partitioning Tree with IDeepRT (IFSPT-IDeepRT) is not well suited to complex tasks.
https://www.indjst.org/
So, RC-IDeepRT (10) was proposed, which continuously corrects the feature space representations within IDeepRT towards salient features of the environment. Moreover, a compression scheme was proposed to handle complex CPS data analysis. In the compression scheme, the IDeepRT layers are compressed through rank selection using Bayesian matrix factorization, Tucker decomposition on the kernel, and fine-tuning to recover the accumulated loss of accuracy. Deep architectures contain a large number of model parameters, so training such a model on a massive amount of data is computationally expensive.
A monitor (11) that operates in both a centralized and a distributed manner was proposed for attack detection and identification in CPS. The centralized monitors were developed using geometric control theory methods, while the distributed detection monitors depended on distributed control technologies and parallel computation. However, this method has a high computational cost.
A data stream clustering algorithm (12) was proposed for pattern detection in CPS. This algorithm was based on a heuristic density-based model developed on the multiple species flocking. It used a local stochastic multi-agent search strategy which allowed agents to perform independently from each other and to interact only with immediate neighbors in an asynchronous way. However, it cannot detect patterns in CPS efficiently.
An Integrity Attack Diagnosis Systems (IADS) method (13) was proposed to identify integrity attacks in CPS automatically. The features of each attack in the spectral and wavelet domains were captured by designing a feature set. Furthermore, a new detection aspect was introduced to handle the previously unseen attacks in the network. However, the IADS method was highly time-consuming for building decision patterns to detect the attacks in CPS.
An RNN technique (14) was introduced to detect and identify sensor attacks in CPS. In the pre-processing phase, the data collected from the sensor nodes was normalized; it was then used in the learning phase, where the weights of the model were fine-tuned with the training data. Based on the learned model, the RNN classified the sensors' states in the prediction phase. However, the RNN model was difficult to train, which in turn made the detection and identification of sensor attacks in CPS difficult. A clustering-based algorithm (15) was proposed to handle multi-modal CPS data in a dynamic manner. However, this algorithm has high computational complexity because of the high dimensionality of multi-modal CPS data.
An Embedding Clustering Based Deep Hypergraph (ECDHG) model (16) was proposed for online review analysis. Pretrained word embeddings were applied to extract the external knowledge in the online reviews. Then, the Hierarchical Fast Clustering (H-CFS) algorithm was used to identify semantic units under the control of semantic cliques. Furthermore, the high-order semantic and textual features were extracted using a CNN. Finally, the hypergraph was built based on the high-order relations of samples for sentiment classification of reviews. However, the hyperedge size used in H-CFS was obtained from a large number of experiments, and it may not be optimal for some datasets.
A Fuzzy logic concept based CPS (17) was proposed for the detection of air pollution in urban areas. Initially, the data was routed between nodes using routing protocols for monitoring and control. Then the data was processed based on the Fuzzy logic concept to observe and validate the air control. It detected heavily air-polluting industries and regions. However, selecting an appropriate membership function in the Fuzzy concept is difficult.
A Genetic Algorithm-Support Vector Machine (GA-SVM) predictor (18) was proposed for agricultural CPS data. The agricultural CPS data was divided into a number of granules using the Fuzzy granulation, Min-Median-Max granulation, and Quartile-Median granulation methods. The granules were processed in the GA-SVM predictor, where the GA optimized the parameters that the SVM then used to predict the agricultural yield. However, it does not handle more complicated data.
A CPS framework (19) was proposed for the measurement and analysis of physical activities. However, the communication was largely one-directional. A non-convex archetypal analysis (20) was introduced for one-class classification based anomaly detection in CPS. However, it has high computational complexity. A holistic Deep Neural Network-driven IoT smart health care method (21) was proposed for CPS data analysis. However, its computational complexity depends on the number of hidden layers of the neural network. A distributed storage and computation k-Nearest Neighbor (kNN) based cloud-edge computing method (22) was proposed for cyber-physical social systems. The convergence speed of this method depends on the initial clusters. A method (23) was proposed for event classification in cyber systems. This method generated realistic synchrophasor data for a given synthetic network and provided an algorithm for detecting and classifying events and bad data. The anomalies were detected based on Bayesian and change-point techniques, and a multi-step clustering approach was used for event classification. This method will be expanded with classification models to distinguish the events in the network.
To decrease the model parameters in RC-IDeepRT, a parameter reduction method is proposed in this study. The main intention of the parameter reduction method is to minimize the total number of neurons and synapses required for a given trained model. Because the RC-IDeepRT architecture executes spatially, the inference time is low, so it is specifically the hardware usage that needs to be considered. In place of hardware usage, the number of parameters is considered because it is architecture-independent. The reduced parameters are used in RC-IDeepRT to classify the CPS data effectively. This reduces the computational expense and increases the accuracy of CPS data classification. The whole process is named IFSPT-RC-PRIDeepRT.

Proposed Methodology
In this section, the proposed RC-PRIDeepRT for CPS data classification is described in detail. The number of parameters of a neural network can be reduced substantially simply by zeroing out small parameters. This is equivalent to pruning connections, which makes the network sparsely connected. In deep learning, large and complex model architectures are employed, and the parameters are usually regularized. This gives more flexibility to the learning algorithms and usually yields better-quality results.
Regularization penalizes large parameters and leads to a sparse solution. Thus, a large number of parameters does not mean that the machine has learned that much knowledge; rather, it is a tool for obtaining better-quality results. Truncating small parameters to zero is therefore a natural clean-up step after training when the size of the model is a concern. Pruning may not save memory on traditional computers unless sparse matrices (or tensors) are supported and explicitly used. However, it directly reduces the usage of synapses in the RC-IDeepRT architecture.
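The point about memory on traditional computers can be illustrated with a minimal sketch. It assumes NumPy, a randomly generated weight matrix, and a simple coordinate (row, column, value) storage format; the helper names `dense_bytes` and `coo_bytes` and the threshold 1.5 are hypothetical and are not taken from RC-IDeepRT.

```python
import numpy as np

def dense_bytes(w: np.ndarray) -> int:
    """Memory used when the matrix is stored densely (zeros included)."""
    return w.nbytes

def coo_bytes(w: np.ndarray, index_dtype=np.int32) -> int:
    """Memory used by a coordinate format: one (row, col, value) triple per nonzero."""
    nnz = int(np.count_nonzero(w))
    index_size = np.dtype(index_dtype).itemsize
    return nnz * (2 * index_size + w.dtype.itemsize)

# Hypothetical weight matrix; values are illustrative only.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

# Prune: zero every weight whose magnitude is below an assumed threshold.
pruned = np.where(np.abs(w) < 1.5, 0.0, w).astype(np.float32)

print("nonzeros kept:", np.count_nonzero(pruned), "of", pruned.size)
print("dense storage (bytes): ", dense_bytes(pruned))
print("sparse storage (bytes):", coo_bytes(pruned))
```

Stored densely, the pruned matrix occupies exactly as many bytes as the original; the saving appears only once the zeros are no longer stored. In a neuromorphic setting each nonzero would instead correspond to one synapse.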
RC-IDeepRT is a neuromorphic architecture that consists of neurons and synapses, which are the processing elements of that architecture. A synapse stores a weight parameter and performs synaptic operations. Each feature is mapped onto a neuron, and the neurons are connected through synapses. In RC-IDeepRT, the features in the selected best feature space flow continuously from the input to the output. Pruning and factorization reduce the number of parameters of RC-IDeepRT, which increases the classification accuracy. The overall process of the IFSPT-RC-PRIDeepRT method is shown in Figure 1.

Parameter Reduction Using Matrix Factorization
The weight matrix $W$ is generally heavily redundant and can be compressed linearly. A low-rank approximation through Singular Value Decomposition (SVD), which factorizes the weight matrix, is used to compress it. Even though variational Bayesian matrix factorization is more efficient, SVD matrix factorization is simple and efficient for the parameter reduction process. The SVD of $W$ is given as follows:
$$W = U S V^{T} \qquad (2.1)$$
In Eq. (2.1), $U \in \mathbb{R}^{n \times r}$, $S \in \mathbb{R}^{r \times r}$, $V \in \mathbb{R}^{m \times r}$, and $r$ denotes the rank of $W$. A layer of $n$ neurons fully connected to $m$ neurons performs a non-linear transformation in IDeepRT. Since $S$ can be merged into $U$ or $V$, Eq. (2.1) can be rewritten as
$$W = W_1 W_2^{T} \qquad (2.2)$$
In Eq. (2.2), $W_1 \in \mathbb{R}^{n \times r}$ and $W_2 \in \mathbb{R}^{m \times r}$. The matrix factorization decomposes the original transform into two transforms which are given as follows: In Eq.
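As a minimal sketch of the low-rank factorization described here, the following assumes NumPy and a synthetic, nearly rank-8 weight matrix. The function name `factorize`, the matrix sizes, and the chosen rank are hypothetical; merging $S$ into $U$ is one of the two options the text allows.

```python
import numpy as np

def factorize(w: np.ndarray, rank: int):
    """Return W1 (n x r) and W2 (m x r) such that W is approximately W1 @ W2.T."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    u, s, vt = u[:, :rank], s[:rank], vt[:rank, :]   # keep the r largest singular values
    w1 = u * s      # merge S into U: column j of U is scaled by s[j]
    w2 = vt.T       # V, shape (m, r)
    return w1, w2

# Hypothetical redundant weight matrix: rank 8 plus a little noise.
rng = np.random.default_rng(0)
n, m, true_rank = 64, 48, 8
w = rng.standard_normal((n, true_rank)) @ rng.standard_normal((true_rank, m))
w += 0.01 * rng.standard_normal((n, m))

w1, w2 = factorize(w, rank=true_rank)
rel_err = np.linalg.norm(w - w1 @ w2.T) / np.linalg.norm(w)

print("original parameters:  ", w.size)
print("factorized parameters:", w1.size + w2.size)
print("relative error:       ", rel_err)
```

The factorized form stores $r(n + m)$ parameters instead of $nm$, so it only pays off when $r < nm / (n + m)$.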

Reducing Small Parameters Using Pruning
The number of parameters of a neural network can also be reduced substantially simply by zeroing out small parameters. This is equivalent to pruning connections, which makes the network sparsely connected. In RC-IDeepRT, a large and complex model architecture is employed and the parameters are generally regularized, which gives more flexibility and high CPS data classification accuracy. The $\ell_1$ regularization leads to a sparse solution, and the $\ell_2$ regularization penalizes large parameters. The vast number of parameters therefore does not mean that the machine has learned that much; rather, it is a device for obtaining a high degree of CPS data classification accuracy. Truncating small parameters to zero is thus a natural clean-up step after training when the model size is a concern. Pruning saves space on conventional computers only when sparse matrices or tensors are supported and explicitly used. However, it directly reduces the usage of synapses in the RC-IDeepRT architecture. A threshold value is used in the pruning process to truncate the small parameters: if the magnitude of a parameter is smaller than the user-specified threshold, that parameter is truncated to zero. To preserve the CPS data classification accuracy, the number of parameters to keep can instead be taken as the input: the parameters are sorted, and that many of the largest parameters are preserved.
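The two pruning variants described here, truncating below a threshold and keeping a given number of the largest parameters, can be sketched as follows. This is an illustrative NumPy sketch; the function names `prune_by_threshold` and `prune_to_count` and the example weights are hypothetical.

```python
import numpy as np

def prune_by_threshold(w: np.ndarray, threshold: float) -> np.ndarray:
    """Zero every parameter whose magnitude is below the threshold."""
    return np.where(np.abs(w) < threshold, 0.0, w)

def prune_to_count(w: np.ndarray, keep: int) -> np.ndarray:
    """Keep the `keep` largest-magnitude parameters and zero the rest."""
    if keep <= 0:
        return np.zeros_like(w)
    if keep >= w.size:
        return w.copy()
    flat = np.abs(w).ravel()
    # After partitioning at position size - keep, the last `keep` indices
    # point at the largest magnitudes (their internal order does not matter).
    top = np.argpartition(flat, flat.size - keep)[flat.size - keep:]
    mask = np.zeros(flat.size, dtype=bool)
    mask[top] = True
    return np.where(mask.reshape(w.shape), w, 0.0)

# Hypothetical trained weights.
w = np.array([[0.9, -0.05, 0.3],
              [-0.02, -1.2, 0.01]])

print(prune_by_threshold(w, 0.1))
print(prune_to_count(w, 2))
```

With a threshold, the number of surviving parameters depends on the weight distribution; with a count, the final model size is fixed in advance, which is convenient when a synapse budget is known.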

Create a Sparsely Connected RC-IDeepRT Network by Integrating Factorization and Pruning
Depth is important in RC-IDeepRT because it allows a cost-effective network to be created for CPS data classification. The integration of factorization and pruning allows a sparsely connected, cost-effective RC-IDeepRT network to be created from a given trained network. The main problem when integrating factorization and pruning is how to treat the singular values. One simple solution is to assign the singular values to the factors, factorize the original layer, and perform ordinary pruning. Let $\eta_{i,j}$ denote the function that forces the $(i,j)$-th element of a matrix to zero. Since the singular vectors are orthogonal, it is straightforward to show that
$$\lVert U S V^{T} - \eta_{i,j}(U)\, S V^{T} \rVert_F = |u_{i,j}|\, s_j \qquad (2.8)$$
$$\lVert U S V^{T} - U S\, \eta_{i,j}(V)^{T} \rVert_F = |v_{i,j}|\, s_j \qquad (2.9)$$
In the above equations, $\lVert \cdot \rVert_F$ is the Frobenius norm, $u_{i,j}$ and $v_{i,j}$ are the $(i,j)$-th elements of $U$ and $V$, and $s_j$ is the $j$-th singular value. Hence, when the matrix factorization is integrated with the pruning, each element of $U$ and $V$ is multiplied by the corresponding singular value when the elements are ranked. Eq. (2.8) and Eq. (2.9) are extended as, However, two elements are truncated each in $U$ and $V$, then
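The ranking rule stated here, that each element of $U$ and $V$ is weighted by its corresponding singular value, can be checked numerically. The sketch below assumes NumPy and a random matrix; `zero_element` plays the role of $\eta_{i,j}$, and both it and `weighted_scores` are hypothetical helper names.

```python
import numpy as np

def zero_element(a: np.ndarray, i: int, j: int) -> np.ndarray:
    """eta_{i,j}: return a copy of `a` with element (i, j) forced to zero."""
    out = a.copy()
    out[i, j] = 0.0
    return out

def weighted_scores(u: np.ndarray, s: np.ndarray, v: np.ndarray):
    """Ranking scores |u_ij| * s_j for U and |v_ij| * s_j for V."""
    return np.abs(u) * s, np.abs(v) * s   # s broadcasts over columns

# Hypothetical weight matrix and its thin SVD.
rng = np.random.default_rng(1)
w = rng.standard_normal((6, 4))
u, s, vt = np.linalg.svd(w, full_matrices=False)
v = vt.T

i, j = 2, 1
# Frobenius-norm error caused by zeroing a single element of U or of V.
err_u = np.linalg.norm(w - zero_element(u, i, j) @ np.diag(s) @ v.T)
err_v = np.linalg.norm(w - u @ np.diag(s) @ zero_element(v, i, j).T)

score_u, score_v = weighted_scores(u, s, v)
print("error from pruning u_ij:", err_u, " score:", score_u[i, j])
print("error from pruning v_ij:", err_v, " score:", score_v[i, j])
```

Because the error of removing one element equals its magnitude times the matching singular value, sorting by these scores rather than by raw magnitudes prunes the elements that disturb the reconstructed weight matrix least.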

Removing Dangling Connections
A neuron is called dangling when it has zero input connections or zero output connections, and the connections of a dangling neuron are called dangling connections. Unlike the low-rank approximation, pruning removes connections rather than neurons, so it can leave dangling neurons behind, which IFSPT-RC-PRIDeepRT then removes. The dangling neurons are identified by counting the input and output connections of each neuron. After the dangling neurons are found, their remaining connections are pruned as well. The pruned network is cost-effective and classifies the CPS data effectively.
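Counting connections to find dangling neurons can be sketched for a single hidden layer as follows. This is a simplified NumPy illustration that ignores bias terms; the function name `remove_dangling` and the weight layout (inputs in columns of `w_in`, outputs in rows of `w_out`) are assumptions made for the example.

```python
import numpy as np

def remove_dangling(w_in: np.ndarray, w_out: np.ndarray):
    """Drop hidden neurons that have no input or no output connections.

    w_in  has shape (n_inputs, n_hidden); column k holds the inputs of neuron k.
    w_out has shape (n_hidden, n_outputs); row k holds the outputs of neuron k.
    """
    n_in_conn = np.count_nonzero(w_in, axis=0)    # input connections per hidden neuron
    n_out_conn = np.count_nonzero(w_out, axis=1)  # output connections per hidden neuron
    alive = (n_in_conn > 0) & (n_out_conn > 0)
    # Dropping a neuron also drops whatever connections it still had.
    return w_in[:, alive], w_out[alive, :], alive

# Hypothetical pruned weights: hidden neuron 1 has no inputs, neuron 2 no outputs.
w_in = np.array([[0.5, 0.0, 0.2],
                 [0.1, 0.0, 0.0]])
w_out = np.array([[0.3],
                  [0.7],
                  [0.0]])

new_in, new_out, alive = remove_dangling(w_in, w_out)
before = np.count_nonzero(w_in) + np.count_nonzero(w_out)
after = np.count_nonzero(new_in) + np.count_nonzero(new_out)
print("alive hidden neurons:", alive)
print("connections before:", before, "after:", after)
```

In a deeper network, removing a neuron can leave a neighbor in the next layer without inputs, so the check would be repeated until no further dangling neurons appear.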
Step 2 : Choose the feature space which has high rated SFI measure.
Step 3 : Develop DQN to combine reinforcement learning with a class of IDeepRT.
Step 4 : Reduce the number of parameters of RC-IDeepRT through SVD matrix factorization technique using Eq. (2.5).
Step 5 : Compress the weight matrix by applying a low-rank approximation technique.
Step 6 : Truncate the small parameters of the model using the pruning process.
Step 7 : Integrate the SVD matrix factorization and pruning techniques.
Step 8 : Remove the neurons which have zero input or zero output connections.
Step 9 : Process the IDeepRT for CPS data classification.
Step 10: Minimize the loss function with respect to the parameters of IDeepRT.

Results and Discussion
Here, the efficiency of the Fuzzy Logic Concept (17), GA-SVM (18), IFSPT-RC-IDeepRT, and IFSPT-RC-PRIDeepRT based CPS data classification methods is analyzed in terms of Accuracy, Precision, Recall, and F_Measure. For the experiments, the Air Quality data and NSF CNS-1446640 datasets (5) are used. Figures 2 and 3 show samples of the Air Quality data and NSF CNS-1446640 datasets, respectively.

Accuracy
It measures the ratio of the number of instances that are correctly classified to the total number of instances evaluated. It is calculated as
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positives, true negatives, false positives, and false negatives, respectively. Figure 4 shows the Accuracy of the existing and proposed methods for data classification on the Air Quality data and NSF CNS-1446640 datasets. The accuracy of IFSPT-RC-PRIDeepRT is 13.33% greater than the Fuzzy Logic Concept, 9.56% greater than GA-SVM, and 1.23% greater than IFSPT-RC-IDeepRT on the Air Quality data. Similarly, the accuracy of IFSPT-RC-PRIDeepRT is 15.76% greater than the Fuzzy Logic Concept, 13.10% greater than GA-SVM, and 1.34% greater than IFSPT-RC-IDeepRT on the NSF CNS-1446640 dataset. This analysis shows that the proposed IFSPT-RC-PRIDeepRT has higher classification accuracy than the other methods for CPS data classification on the Air Quality and NSF CNS-1446640 datasets.

Precision
It is used to determine the proportion of positive patterns that are correctly classified among all patterns classified into the positive class. It is calculated as
$$\text{Precision} = \frac{TP}{TP + FP}$$
where $TP$ and $FP$ denote the numbers of true positives and false positives, respectively. From Figure 5, it is observed that the Precision of the IFSPT-RC-PRIDeepRT method is 12.16%, 8.46%, and 1.33% greater than the Fuzzy Logic Concept, GA-SVM, and IFSPT-RC-IDeepRT methods, respectively, for the Air Quality data. For the NSF CNS-1446640 dataset, the precision of the IFSPT-RC-PRIDeepRT method is 17.38%, 10.79%, and 1.54% greater than the Fuzzy Logic Concept, GA-SVM, and IFSPT-RC-IDeepRT methods, respectively. Hence, the proposed IFSPT-RC-PRIDeepRT method gives better results in terms of precision than the other methods.

Recall
It is used to determine the ratio of positive patterns that are correctly classified. It is calculated as
$$\text{Recall} = \frac{TP}{TP + FN}$$
where $TP$ and $FN$ denote the numbers of true positives and false negatives, respectively. The performance comparison between the existing and proposed methods in terms of recall is shown in Figure 6. From the experimental results, the recall of IFSPT-RC-PRIDeepRT is 15.88% greater than the Fuzzy Logic Concept, 7.07% greater than GA-SVM, and 1.34% greater than the IFSPT-RC-IDeepRT method for the Air Quality dataset. For the NSF CNS-1446640 dataset, the recall of IFSPT-RC-PRIDeepRT is 19.76%, 12.87%, and 1.24% greater than the Fuzzy Logic Concept, GA-SVM, and IFSPT-RC-IDeepRT methods, respectively. This comparison shows that IFSPT-RC-PRIDeepRT has higher recall than the other methods when classifying the CPS data.

F_Measure
It is the harmonic mean of the recall and precision values. It is calculated as
$$\text{F\_Measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
In Figure 7, the effectiveness of the IFSPT-RC-PRIDeepRT method is analyzed in terms of F_Measure by comparing it with the existing methods. The F_Measure of IFSPT-RC-PRIDeepRT is 16.07%, 9.55%, and 0.83% greater than the Fuzzy Logic Concept, GA-SVM, and IFSPT-RC-IDeepRT methods, respectively, for the Air Quality data. Similarly, for the NSF CNS-1446640 dataset, the F_Measure of IFSPT-RC-PRIDeepRT is 18.41%, 14.25%, and 0.62% greater than the Fuzzy Logic Concept, GA-SVM, and IFSPT-RC-IDeepRT methods, respectively. This analysis shows that the proposed IFSPT-RC-PRIDeepRT method gives better results in terms of F_Measure than the other methods.
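The four evaluation measures used in this section can be computed from confusion-matrix counts as in the following sketch. The function name `classification_metrics` and the example counts are hypothetical; they are not results from the Air Quality or NSF CNS-1446640 experiments.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }

# Made-up counts, for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The guards against zero denominators are a design choice for the sketch; they return 0 when a measure is undefined, for example when no instance is predicted positive.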

Conclusion
The aim of this study is to enhance the performance of CPS data classification by reducing the number of parameters and compressing several layers of IDeepRT. Initially, the best feature space is selected based on the IFSPT. The features in the selected feature space are processed in RC-PRIDeepRT to classify the CPS data. The reinforcement learning corrects the feature space based on prior knowledge of the features. Several layers are compressed using a compression scheme. A parameter reduction method is introduced to decrease the model parameters. The parameter reduction method combines factorization and pruning, which allows a cost-effective network to be created. By reducing the parameters, the RC-PRIDeepRT method enhances the accuracy and reduces the error rate. The experimental results show that the proposed IFSPT-RC-PRIDeepRT method has higher Accuracy, Precision, Recall, and F_Measure than the Fuzzy Logic Concept, GA-SVM, and IFSPT-RC-IDeepRT for CPS data classification.