Deep Learning with Particle Swarm Based Hyper Parameter Tuning Based Crop Recommendation for Better Crop Yield for Precision Agriculture

A common difficulty among Indian farmers is that they do not choose the proper crop for their soil's requirements, and productivity suffers as a result. This problem has been addressed through precision agriculture. The method is characterized by a crop database collected from the farm, together with crop information provided by agricultural experts, which is given to a recommendation system. The system uses the collected data, with deep learning models as learners, to recommend a crop for site-specific parameters with high accuracy and efficiency. Objectives: The overall objective of the model is to deliver direct advisory services to even the smallest farmer, at the level of his or her smallest plot of crop, using the most accessible technologies and deep learning. It is a recommender model built from a classifier and an optimization of that classifier. The prediction accuracy depends on the choice of appropriate parameters. Novelty: MDNN can produce simpler models whose weights can be optimized through PSO. PSO-MDNN sub-models are produced from the datasets, and the proposed models adapt to the data patterns of the featured datasets. The proposed work also exhibits promising outcomes in predicting crop yields using deep learning techniques (DLTs).


Introduction
In recent times, forecasting of crop productivity at the within-field level has increased. The most influential factor for crop productivity is the weather. If weather-based prediction is made more precise, farmers can be alerted well in advance, so that major losses can be mitigated, which in turn supports economic growth. The prediction also helps farmers make decisions such as choosing alternative crops or discarding a crop at an early stage in critical situations. Further, predicting crop yield gives farmers a better view of the cultivation and scheduling of seasonal crops (1).
Achieving the maximum crop yield from limited land resources is a goal of agricultural planning in an agro-based country. A crop selector can minimize losses when unfavorable conditions occur and maximize the crop yield rate when favorable growing conditions exist (2). Thus, it is necessary to simulate and predict the crop yield before cultivation for efficient crop management and the expected outcome. Because the relationship between crop yield and the factors influencing the crop is nonlinear, machine learning techniques may be efficient for yield prediction.
More recently, machine learning techniques have been applied to crop yield prediction, including multivariate regression, decision trees, association rule mining, and artificial neural networks. A salient feature of machine learning models is that they treat the crop yield output as an implicit function of the input variables, which can be a highly non-linear and complex function. Compared with the aforementioned machine learning models in the literature (3), deep learning methods with multiple hidden layers are more powerful at revealing the fundamental non-linear relationship between input and response variables in the field of crop recommendation, but they also require more advanced hardware and optimization techniques to train.
Several factors determine crop yields, such as UVL (ultraviolet light), water, pesticides, the land used for cultivation and the fertilizers sprayed on crops. The study in (4) used an ANN (Artificial Neural Network) and evaluated agricultural datasets, from which 140 attributes were used to predict crop yields. CNNs (Convolutional Neural Networks) were used in (5) to classify images and build a crop yield prediction model based on NDVI (Normalized Difference Vegetation Index)/RGB (Red Green Blue) data from UAVs (Unmanned Aerial Vehicles). That study evaluated the CNN results by varying parameters such as the training algorithm, network depth (number of layers), regularization and hyperparameter tuning.
The study in (6) examined the relationship between MLR (Multiple Linear Regression) and ANN by proposing their MLR-ANN model. Their proposed hybrid model applied MLR intercept and coefficients to initialize ANN's weights and bias in the input layer and efficiently predicted crop yields. The study in (7) considered unpredictable climatic changes and scarce water resources for their predictions on crop yields. The study gathered, stored and analyzed agricultural and climatic changes data which were then explored by DM (Data Mining) techniques. The study estimated crop yields, selected the most suitable crop, and demonstrated the utility of DM techniques for farmers' cultivable areas.
Losses caused to farmers by the grass grub insect were studied in (8). The extent of crop damage was assessed with well-known classifiers such as DTs (Decision Trees), RF (Random Forest), NB (Naïve Bayes), SVM (Support Vector Machines) and KNN (K-Nearest Neighbors) (9). Pouteau et al. (10) compared six machine learning algorithms (SVM, Naïve Bayes, C4.5, RF, Boosted Regression Tree, and kNN) on six satellite data sets from different sensors (Landsat-7 ETM+, SPOT, AirSAR, TerraSAR-X, Quickbird, and WorldView-2) for tropical ecosystem classification and stated that kNN performs better for the Landsat-7 ETM+ classification. Most recently, Heydari and Mountrakis (11) studied the effects of classifier selection, reference sample size, reference class distribution, and scene heterogeneity on per-pixel classification accuracy, using 26 Landsat sites and five classification algorithms (Naïve Bayesian, kNN, SVM, Tree ensemble, and Artificial Neural Network).
They found that NNs (Neural Networks) and RF performed slightly better than the other classifiers. The study also stated that ensemble models can improve the outputs of weak classifiers for agriculture-related problems. DNNs have achieved excellent performance in the agricultural domain. However, one major issue with DNNs is their dependency on hyper-parameters, a dependency that should be reduced to make their results more effective. Previously proposed architectures for predicting crop yields are often designed manually by deep learning experts, who may fail to design optimal architectures because they do not, or may not, understand agriculture. Hence, this study proposes an automated model for crop yield prediction. To overcome the existing challenges, meteorological data are pre-processed and fed into the proposed PSO-MDNN framework to extract intricate spatial and temporal features. The literature thus suggests that the proposed deep learning method can be used for efficient estimation of crop yields.
The main contributions of this work are as follows:
• First, a large volume of historical crop production data and climate data is gathered and pre-processed.
• Then, a prediction model based on a Modified Deep Neural Network (MDNN) is used for crop recommendation. The Particle Swarm Optimization (PSO) algorithm is adopted to determine the optimal architecture of the MDNN and produce better crop recommendation results. The combined model is named PSO-MDNN.

Proposed Methodology
Historical crop production and climate data are pre-processed in this work and then used to train the MDNN, which is optimized by PSO. PSO is used to determine the optimal architecture of the DNN sub-models for each group of datasets, i.e. the number of hidden layers, the neuron count in the hidden layers, the activation function and the training approach of the predictive model. DNNs are used because of their recent popularity in classification tasks. DNN performance is directly related to the network architecture, which is usually framed manually by experts.

Dataset collection
This work's weather data was collected from https://www.timeanddate.com/weather/india/new-delhi/historic, crop yields from faostat3.fao.org, and a further dataset from https://data.world/thatzprem/agriculture-india (updated four years ago). The fetched data contained many variables describing the rainfall intensity of regions and their crop production yields. The climate history CSV (Comma Separated Values) file had yearly data in rows, and each row of the crop production CSV gave yearly crop yields.

Data pre-processing
The combined dataset contained multiple parameters, of which the unimportant ones were dropped during pre-processing. Thus, disparate historical records were combined after eliminating unwanted parameters to make the predictions in this study more effective. The parameters used for every month were Maximum (highest), Minimum (lowest) and Average Temperature (the temperature average of a region per month), Precipitation, Humidity, Dew Point and Wind Pressure. The dataset was split 60/40 for training and testing. Classifiers were trained on the training data to create models, which were then evaluated for prediction accuracy on the test data.
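The 60/40 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_60_40`, the record layout and the sample values are all hypothetical, and the shuffle seed is an assumption added for reproducibility.

```python
import random


def split_60_40(records, train_fraction=0.6, seed=0):
    """Shuffle the records reproducibly and split them into train and test parts."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle (assumption, for repeatability)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical monthly records: (max_temp, min_temp, avg_temp, humidity, crop_label)
records = [
    (30 + i, 18 + i, 24 + i, 60 - i, "rice" if i % 2 else "wheat")
    for i in range(10)
]

train, test = split_60_40(records)
print(len(train), "training rows,", len(test), "test rows")
```

Models are then fitted on `train` only, and `test` is held back for measuring prediction accuracy.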

PSO-MDNN Data classification
DNNs use multiple layers in succession, where the output of one layer becomes the input of the next. The number of layers is the depth of a DNN. The intermediate layers are called hidden layers, and in this work fuzzy logic is used in them. DNNs are popular because they select the most suitable features before classifying them. Figure 1 depicts a DNN. The proposed MDNN classifies pre-processed data into suitable and non-suitable crops. The MDNN makes a decision at every input node and passes it to the next tier, which is then trained (12). The pre-processed input data D is trained by the DNN using input, hidden and output layers, with many hidden layers between the inputs and the outputs. Input-layer elements (data points) are assigned weights and then processed by the subsequent layers, and the DNN assigns weights in each layer until the final output is obtained. An error function $E_j$ measures the correctness of learning, which for the MDNN means the correctness of the recommended crops. This learning cycle is repeated until the DNN has captured the relationship between the inputs and the extracted data, resulting in a set of learned classes on which the crop recommendation is based.
The function $f(x) = x$ is used on the input neurons, and the sigmoid transfer function is given in Equation (1): $f(x) = \frac{1}{1 + e^{-x}}$. The error of each output node $j$ is defined in Equation (2), where $targ_{pj}$ is the desired target output for the $p$-th observation and $out_{pj}$ is the actual output. The flexibility of the DNN model may result in overfitting when the training samples are few, so in this study the issue is addressed with L2 regularization, which modifies the error function. It shrinks the weights towards, but not exactly to, zero and is given in Equation (3), where $\lambda$ is the regularization parameter and $m$ is the count of features. The value of this hyperparameter is optimized using PSO in this work.
The weights of the input features, $W = \{w_1, w_2, \ldots, w_n\}$, are computed by the fuzzy membership function given in Equation (4), where the input nodes are linguistic labels and $\mu(x)$ denotes the membership functions, whose highest and lowest values are 1 and 0. The weighted sum of the inputs, obtained by an add function (adder), is shown in Equation (5). The MDNN output is formulated in Equation (6), where $y$ denotes the output neurons, $f(x)$ the transfer function, $w_i$ the weights, $x_i$ the inputs and $b_i$ the biases. The output neurons learn the relationship between countermeasures and parameters and are thus able to learn and classify crops. The MDNN algorithm steps are listed in Table 1.
Table 1. MDNN algorithm
For each node $j$ in the output layer, calculate the error function (Equation (2)), where $targ_{pj}$ is the desired target output for the $p$-th observation and $out_{pj}$ is the actual output for the $p$-th observation.
For each node $j$ in the output layer, calculate the L2-regularized output $out_{regularized}$ (Equation (3)).
Return the class label results with crop types.
The MDNN algorithm above first takes the pre-processed attribute vectors as input, which produce the output of the first layer. The inputs are processed iteratively, where every iteration is a matrix multiplication with the weight matrix w followed by the addition of the bias b. The sigmoid function is applied in the hidden layers, and the resulting matrix is fed to the subsequent layer. The values of w and b are updated based on the errors. The final output of the MDNN is a probability for each crop, which is converted into class labels for forecasting.
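The layer-by-layer processing described above (weighted sum, bias, sigmoid, then the next layer) can be sketched in a few lines. This is only an illustration under stated assumptions: the exact forms of the paper's error function and L2 term are not recoverable from the text, so the half-sum-of-squares error and the `lam * sum(w^2)` penalty below are common textbook choices, and the layer sizes, weights and feature values are hypothetical.

```python
import math


def sigmoid(x):
    """Sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """One layer: sigmoid(w . x + b) per neuron; `weights` holds one row per neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activation = inputs
    for weights, biases in layers:
        activation = dense_layer(activation, weights, biases)
    return activation


def squared_error(targets, outputs):
    """Assumed error form: half the sum of squared differences."""
    return 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))


def l2_penalty(layers, lam):
    """Assumed L2 form: lam times the sum of squared weights (biases excluded)."""
    return lam * sum(w * w for weights, _ in layers for row in weights for w in row)


# Hypothetical network: 3 input features -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0.0, 0.0]),
    ([[1.0, 2.0]], [0.0]),
]
features = [0.3, 0.7, 0.1]  # hypothetical pre-processed attribute vector

output = forward(features, layers)
loss = squared_error([1.0], output) + l2_penalty(layers, lam=0.1)
print(output, loss)
```

In a full training loop, the weights and biases would be adjusted by back propagation to reduce `loss`, and `lam` is the kind of hyperparameter that the paper tunes with PSO.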
PSO-MDNN classifier: This work uses PSO to select optimal hyperparameters, including the number of neurons in the layers (including the hidden layers), the learning rate, the initial neuron weights and the momentum (speed) of the MDNN. The overall purpose of applying PSO is to improve the precision of crop recommendations. PSO simulates the behaviour of birds in flight. The birds in a flock communicate with each other during flight while keeping an eye on the bird in the best position so as to follow its direction. The flock inspects the search space for new locations until the designated destination is reached, and in the process it exhibits intelligent social behaviour. The flock thus discovers routes using personal experience (local search) and the experience of other flock members (global search).
Formulated mathematically, N is a collection of random particles, and the $i$-th particle is denoted by its position in an S-dimensional space with S variables. Each particle $i$ tracks three values: its current position $A_i = \{a_{i1}, a_{i2}, \ldots, a_{iS}\}$, its best position in previous cycles $Ps_i = \{ps_{i1}, ps_{i2}, \ldots, ps_{iS}\}$, and its flying velocity $Ve_i = \{ve_{i1}, ve_{i2}, \ldots, ve_{iS}\}$. In each time cycle (interval), the position $P_{bg}$ of the best particle $bg$ is the one with the best fitness among the particles. Each particle therefore changes its velocity $Ve_i$ to move closer to the best particle $bg$, as depicted in Equation (7):
$$\text{New } Ve_i = \omega \times \text{current } Ve_i + pc_1 \times \text{rand}() \times (Ps_i - A_i) + pc_2 \times \text{Rand}() \times (P_{bg} - A_i) \quad (7)$$
where $pc_1$ and $pc_2$ are positive constants (learning factors, with $pc_1 = pc_2 = 2$), and rand() and Rand() are functions that generate random values in the range [0, 1]. The particle updates its position using the new velocity $Ve_i$ according to Equation (8):
$$\text{New position } A_i = \text{current position } A_i + \text{New } Ve_i \quad (8)$$
where $Ve_{max} \geq Ve_i \geq -Ve_{max}$. $Ve_{max}$ is the upper limit on the velocity change of a particle, and $\omega$ is the inertia weight applied to the current velocity to control the influence of the previous velocity history. $\omega$ balances global and local search and is reduced linearly with time from 1.4 to 0.5.
Global searches are initiated with a large weight, which is reduced over time to favour local searches. In the velocity update of Equation (7), the second term represents the private judgment of a particle, comparing its current position with its own best position. The third term denotes social collaboration, comparing a particle's current position with that of the best particle. Moreover, particle velocity changes are controlled by the user-specified $Ve_{max}$. Once the new position is computed from Equation (8), the particle moves towards it. Thus, PSO's main parameters (13) are the population size (number of birds), the generation count, $\omega$ and $Ve_{max}$. For the input crop data, each particle is checked for fitness by the fitness function and its best position pBest is determined. When fitness(i) is better than pBest(i), pBest(i) is set to fitness(i). When the termination condition is satisfied, execution stops, resulting in the final classifications. pbest denotes a particle's personal best location, while gbest denotes the best global location. The pseudo-code of the PSO algorithm is detailed below:
• Set the initial MDNN hyperparameters: $\lambda$, neuron count, hidden layer count, weights and bias. A population P of hyperparameters is generated from the dataset N such that $HP_i = (P_1, P_2, P_3, \ldots, P_N)$
• Evaluate every particle's position, including its fitness, based on the minimization objective function
• Set cycle count = 1
• Iterate:
• Update the particle velocities based on Equation (9): $Ve_i(t) = Ve_i(t-1) + C_1 r_1 (pbest(t) - a_i(t-1)) + C_2 r_2 (gbest(t) - a_i(t-1))$, where $C_1$ and $C_2$ are acceleration values and $r_1$ and $r_2$ are random values in the interval [0, 1]
• Check that the velocity is within the range $Ve_{min} \leq Ve_i \leq Ve_{max}$
• Assign new positions to the particles based on $A_i(t) = A_i(t-1) + Ve_i(t)$
• Check that the values do not exceed the defined limits
• Compare the fitness value of each particle with its prior best value. When the current value is better, assign the current value to its pbest and the current location to the pbest location in N
• Compare the current fitness value with gbest and, if it is better, assign it to gbest
• Check whether the termination point is reached. If yes, stop, otherwise return to the velocity update step
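The pseudo-code above can be sketched as a small, self-contained optimizer. This is a hedged illustration rather than the authors' implementation: it uses the inertia-weighted velocity update with pc1 = pc2 = 2 and an inertia weight falling linearly from 1.4 to 0.5, as described in the text, but the velocity cap (half of each search range), the function names and the toy objective are assumptions. In the paper the fitness would come from training and validating the MDNN, whereas here `toy_validation_error` is a hypothetical stand-in with a known minimum.

```python
import random


def pso_minimize(fitness, bounds, n_particles=20, n_iters=30,
                 c1=2.0, c2=2.0, w_start=1.4, w_end=0.5, seed=0):
    """Minimize `fitness` over the box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    v_max = [0.5 * (hi - lo) for lo, hi in bounds]  # assumed velocity cap
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]

    for t in range(n_iters):
        # Inertia weight decreases linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * t / max(1, n_iters - 1)
        for i in range(n_particles):
            for d in range(dim):
                lo, hi = bounds[d]
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max[d], min(v_max[d], v))         # clamp velocity
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))  # clamp position
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:          # update personal best
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:         # update global best
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history


def toy_validation_error(h):
    """Hypothetical stand-in for MDNN validation error; minimum at lam=0.01, 32 neurons."""
    lam, neurons = h
    return (lam - 0.01) ** 2 + ((neurons - 32.0) / 64.0) ** 2


# Search ranges (assumed): regularization strength and hidden-layer neuron count.
bounds = [(0.0, 0.1), (4.0, 64.0)]
best, best_fit, history = pso_minimize(toy_validation_error, bounds)
print("best hyperparameters:", best, "fitness:", best_fit)
```

Integer hyperparameters such as the neuron count would be rounded before building a network. Because gbest is replaced only when a strictly better fitness is found, `history` never increases from one cycle to the next.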

Results and Discussion
The implementation results of the proposed PSO-MDNN system are presented in figures and tables where required. The proposed technique is also benchmarked against existing research techniques (10). In the measures below, TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives.
Recall: The recall of a classifier is the ratio of correctly classified positive crops to the total number of positive samples, estimated as follows: $\text{Recall} = \frac{TP}{TP + FN}$
F-measure: Also called the $F_1$-score, it is the harmonic mean of precision and recall: $F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
Accuracy: One of the most commonly used measures of classification performance, it is defined as the ratio of correctly classified crops to the total number of crops: $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
A confusion matrix was created as part of the evaluation and is depicted in Figure 3, while Table 2 compares the performance of the existing classifiers with the proposed PSO-MDNN using various metrics.
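As a worked illustration of these measures, the sketch below computes them from the four cells of a binary confusion matrix. The counts are hypothetical and are not results from this study. Precision, which the F-measure depends on, uses its standard definition TP / (TP + FP).

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for a "suitable crop" versus "non-suitable crop" classifier.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a multi-class crop recommender, the same counts can be taken per class (one class against the rest) and then averaged.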

Conclusion
Finally, a new deep learning-based approach for predicting crop yields has been proposed. The approach, named the PSO-MDNN recommender model, is found to be effective in recommending a suitable crop. Most importantly, this work goes a step further in that it also predicts crop yields. It has demonstrated its accuracy by outperforming the other techniques in its benchmarks, namely Dec-Tree, KNN, R-forest and Neu-Net. Further, the proposed model predicted yields in untested environments with an accuracy of 95.49%, so it could be used for future crop yield predictions. The PSO-MDNN in this work was trained using back propagation to improve classification accuracy. The method could be extended to classify a hybrid as either low-yielding or high-yielding based on its relative performance against other hybrids, within the same application of producing crop recommendations.