Implementation of Genetic Algorithm for School Monitoring System for Matiari District, Sindh, Pakistan

Background: This study focuses on optimization, a field of computer science that is useful when people cannot easily solve the problem of finding an efficient path by hand. Methods: An in-depth analysis of the problem was followed by the calculation of the distances between schools. Detailed routes were then formed so that the distances could be used in simulation-based optimization. The optimization is based on a Genetic Algorithm, and a graphical user interface has been developed for this purpose. Findings: The school monitoring teams were given an optimized path for their visits, which lets them save time and fuel. Results show that the genetic algorithm produced an efficient and accurate path for the monitoring team. Improvements/Applications: The current study can be extended by comparing the genetic algorithm's output with that of other optimization algorithms, so that efficiency and time can be measured.


Introduction
There are a number of jobs that are tedious for human beings but easy for computers and machines, especially in situations where many calculations are needed 1. Optimization algorithms are one example; they can handle calculations that are nearly impossible for human beings. Many schools are operating in the Sindh province of Pakistan, and many of their teachers remain absent from their duties. One of the major reasons behind this is the political recruitment of staff. According to local newspapers, teachers at primary and middle schools across Sindh are frequently absent from their schools, and the students' future is at risk. Recently

Genetic Algorithm (GA)
Professor John Holland of the USA proposed the Genetic Algorithm (GA) in computer science and Artificial Intelligence (AI) in 1975. The algorithm brings the biological process of natural selection into computer science. GA belongs to the larger class of evolutionary algorithms and is used to find optimized solutions 10. GA has been applied in many fields, including medicine, planning, technology scheduling, biology, engineering and others 11,12. Many variations of GA are employed in real-time problems and other intelligent systems 13,14. GA works with the following components: population, selection, fitness function, crossover and mutation. A simple working diagram of GA is illustrated in Figure 2.

Population
This operator treats every state, node or other identifier as a member of the population. The population is the set of candidate solutions, and each solution is called an individual or chromosome. The initial chromosomes are usually selected at random. The population is represented as an array or string of genes (alleles). It is worth mentioning that every individual in the population should have the same length. A fitness value is assigned to each individual. The fitness value is the result of the fitness function of GA and is specific to each individual.
Figure 1. Artificial intelligence algorithms
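A minimal sketch of such a population for the school routing problem is given here. It assumes that each chromosome is a visiting order (a permutation of school indices), which the text does not state explicitly; the function name `init_population` and its parameters are hypothetical.

```python
import random

def init_population(n_schools, pop_size, seed=None):
    """Create pop_size random tours; each tour is a permutation of school indices."""
    rng = random.Random(seed)
    population = []
    for _ in range(pop_size):
        tour = list(range(n_schools))   # genes: school indices 0..n_schools-1
        rng.shuffle(tour)               # random visiting order
        population.append(tour)
    return population

# Example: 20 random visiting orders over 12 schools (all of equal length)
population = init_population(n_schools=12, pop_size=20, seed=1)
```

Because every chromosome is a permutation of the same indices, all individuals have the same length, as required above.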

Selection
Selection is the next operator used in GA; its basic job is to choose the individuals that will reproduce the next generation. It picks the nodes or places that are more desirable and avoids those of less interest. Because the favoured nodes or places are carried forward, the method is also called reproduction. The most desirable node or place has the best chance of passing to the next level. A good selection strategy makes fitness evaluation more efficient and moves the search toward the optimized solution. Conversely, an improper selection strategy may steer GA in the wrong direction and hinder the success of the overall algorithm.
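As an illustration, the sketch below shows a binary tournament, one common selection strategy. It assumes that a shorter tour length means a fitter individual; the function name and the sample data are hypothetical.

```python
import random

def binary_tournament(population, lengths, rng):
    """Draw two distinct individuals at random and return the one with the shorter tour."""
    a, b = rng.sample(range(len(population)), 2)
    return population[a] if lengths[a] <= lengths[b] else population[b]

# Hypothetical data: three candidate tours and their total distances
population = [[0, 1, 2], [0, 2, 1], [1, 0, 2]]
lengths = [30.0, 25.0, 40.0]
rng = random.Random(0)
parents = [binary_tournament(population, lengths, rng) for _ in range(10)]
```

With this scheme the worst individual can never win a tournament, while better individuals are chosen more often without the search collapsing onto the single best tour.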

Fitness function
The fitness function checks how suitable a GA solution is for the optimization problem. Every place or location is checked against the others for the optimality of the solution, and the outcome of that check is the result of the fitness function. A more suitable fitness function tends to produce an optimal path in the paths that follow.
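For a routing problem, one plausible fitness function is the inverse of the total tour length. The sketch below assumes straight-line (Euclidean) distances and a closed tour that returns to the starting school; both are assumptions, and the function names are hypothetical.

```python
import math

def tour_length(tour, coords):
    """Total Euclidean length of a closed tour that returns to its starting school."""
    total = 0.0
    for i in range(len(tour)):
        x1, y1 = coords[tour[i]]
        x2, y2 = coords[tour[(i + 1) % len(tour)]]
        total += math.hypot(x2 - x1, y2 - y1)
    return total

def fitness(tour, coords):
    """Shorter tours receive a higher fitness value."""
    return 1.0 / tour_length(tour, coords)

# Four hypothetical schools on the corners of a 10 x 10 square
coords = [(0, 0), (0, 10), (10, 10), (10, 0)]
perimeter_tour = [0, 1, 2, 3]   # follows the edges of the square
crossing_tour = [0, 2, 1, 3]    # uses both diagonals
```

Here the tour along the edges is shorter than the one that crosses the square twice, so it receives the higher fitness.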

Crossover
Crossover, or reproduction, is one of the important steps of GA, in which offspring are produced from two parents that exchange data with each other to produce the succeeding children 15,16. The crossover operator helps to create better offspring from the selected parent candidates. A good crossover can generate a better individual. The generation of a new child is shown in Figure 3. The child inherits from Parent1 and Parent2.
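The text does not say which crossover variant was used. The sketch below shows a simplified order-based crossover, a common choice when chromosomes are visiting orders because the child stays a valid permutation; the function name and the parent tours are hypothetical.

```python
import random

def order_crossover(parent1, parent2, rng):
    """Copy a random slice from parent1, then fill the gaps in parent2's order."""
    n = len(parent1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = parent1[i:j + 1]                # slice inherited from parent1
    kept = set(parent1[i:j + 1])
    fill = [g for g in parent2 if g not in kept]     # remaining schools, in parent2's order
    pos = 0
    for k in range(n):
        if child[k] is None:
            child[k] = fill[pos]
            pos += 1
    return child

parent1 = [0, 1, 2, 3, 4, 5, 6, 7]
parent2 = [7, 6, 5, 4, 3, 2, 1, 0]
child = order_crossover(parent1, parent2, random.Random(3))
```

A plain one-point crossover would usually duplicate some schools and omit others, which is why a permutation-preserving operator is assumed here.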

Mutation operator
This operator is applied after the crossover operator 17. It is also used to avoid premature convergence of the Genetic Algorithm. It searches for new locally optimized or possible solutions among the available places, nodes or solutions rather than searching only among the existing ones. It checks for new solutions across the entire search space.
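A minimal sketch of one possible mutation for visiting orders is a swap of two schools, applied with a small probability. The swap variant, the function name and the `rate` parameter are assumptions, not details taken from the study.

```python
import random

def swap_mutation(tour, rate, rng):
    """With probability `rate`, exchange two randomly chosen schools in the tour."""
    mutated = list(tour)                 # leave the input tour untouched
    if rng.random() < rate:
        i, j = rng.sample(range(len(mutated)), 2)
        mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

tour = [0, 1, 2, 3, 4, 5]
always = swap_mutation(tour, rate=1.0, rng=random.Random(5))   # mutation forced
never = swap_mutation(tour, rate=0.0, rng=random.Random(5))    # mutation disabled
```

The swap keeps the tour a valid permutation while introducing an ordering that crossover alone might never produce.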

Research Methodology
The current study was performed in MATLAB, and the results are based on simulations containing various scenarios. The Genetic Algorithm (GA) has been employed to carry out the optimization. The focus is on optimized route selection to give the visiting team an efficient path.
In the proposed study, multiple union councils are placed on an artificial map and the schools are then monitored. The schools are drawn on the map so as to approximate their real locations. The movements of the monitoring teams are shown as links between two schools. The calculation is made on a distance matrix computed from real data collected from the government, where permissible.
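The distance matrix mentioned above can be sketched as follows. The sketch assumes planar coordinates and straight-line distances; the coordinates are invented for illustration and are not the study's data.

```python
import math

def distance_matrix(coords):
    """Symmetric matrix of straight-line distances between every pair of schools."""
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            dist[i][j] = dist[j][i] = d
    return dist

# Three hypothetical school locations (units are arbitrary)
coords = [(0, 0), (3, 4), (6, 8)]
dist = distance_matrix(coords)
```

Computing the matrix once lets the GA look up distances between schools instead of recomputing them for every candidate route.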

Proposed Research Framework
The work started with understanding the problem and its possible solutions. A thorough examination of the problem was made at the start. The problem was then formulated on the basis of original data, for which the original map and other data were obtained. The locations were then plotted on the field or simulation graph. The software used for the simulation is MATLAB 2017. Another option allows manual placement, where the locations of the schools can be plotted anywhere on the map. The schools or locations can be understood as the places that employees visit during the monitoring process. The plotted locations are given as input to the GA, which selects any location as the start and then provides an optimized, efficient path so that the fuel and time of the monitoring staff can be saved. At each step the GA chooses the next location as the best or most optimized one among the available solutions. The GA provides the best path both locally and globally: it calculates the distances from every location to every other location and finally the minimum distance of the overall tour, with the objective of efficient monitoring that saves fuel and time. The proposed model has the following steps (see Figure 4).

Phase 1: In-depth Analysis of School Monitoring System
During this phase, an in-depth analysis of the system has been carried out, based on the school monitoring system formulated by the Government of Sindh, Education and Literacy Department, for the monitoring of its schools and the formation of monitoring officers and teams. The monitoring system's rules and regulations, monitoring mechanisms, devices used and some other parameters have been presented in this study.

Phase 2: Study Based on Calculations of Schools and Distances
During this phase the requirements have been defined, the schools have been located and their distances have been calculated from the given data and the district data, in order to formulate the problem and to provide an optimal solution for the shortest path that the school monitoring officers and teams have to follow to save fuel and time.

[Figure 4 flowchart stages: In-depth analysis of school monitoring system; Study based on calculation of schools and distances; Monitoring route detail formulation against district data; Implementation of Genetic Algorithm to find optimal path; GUI based Application development]

Phase 3: Monitoring Route Detail Formulation against District Data
In this stage, the routes have been identified, the distances have been analyzed and the start and end times of the tour are calculated. The visiting mechanism, the avoidance of repeated visits and many other factors have been analyzed and defined in this stage.

Phase 4: Implementation of GA to Find Optimal Path
During this crucial phase, GA has been applied step by step, with all of its stages, in order to obtain an optimal path among the schools located at the various places available in the district data. This optimal path (shortest route) has been found to be the shortest path for the visits of the school monitoring officers or teams on a particular day, so that fuel and time are saved. The visits can also be scheduled efficiently.

Phase 5: GUI based Application development
At this final stage, an application has been developed for the above stages, and a graphical representation of the total distance and the optimal path is given so that the problem can be formulated and solved visually. MATLAB 2017 has been used for the implementation of GA.

Proposed Research Scenario
The proposed model represents each location as a dot, circle, star or node on the plane; a total of 12 schools of Matiari district have been placed on the ground. This serves as an example, since various combinations of schools have been placed on the ground and their optimal paths calculated. The ground has been set to 100 x 100 points, and each school location is drawn as a circle for ease of understanding. The x-coordinates and y-coordinates are labelled to make the plane of the experiments clear. The visit of the monitoring team may start from any school location and ends once all locations in the district, or points on the plane, have been covered.
Another scenario has been created to test whether the proposed framework works. For this purpose, various planes with varying numbers of schools have been plotted so that the efficiency of GA can be tested. The proposed algorithm has been tested and found efficient for small as well as increasing numbers of schools, as shown in Figure 5.
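The scenario can be mimicked with a small sketch that scatters 12 points on a 100 x 100 plane and shows that, if the team returns to its starting school, the total distance does not depend on which school the visit starts from. The random placement, the closed-tour assumption and all function names are illustrative assumptions rather than the study's setup.

```python
import math
import random

def place_schools(n, size, seed):
    """Scatter n hypothetical school locations uniformly on a size x size plane."""
    rng = random.Random(seed)
    return [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n)]

def closed_tour_length(tour, coords):
    """Length of a tour that returns to its starting school."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def rotate_to_start(tour, start):
    """Same cyclic route, re-expressed so that it begins at school `start`."""
    k = tour.index(start)
    return tour[k:] + tour[:k]

coords = place_schools(n=12, size=100, seed=42)
tour = list(range(12))
rotated = rotate_to_start(tour, start=5)
```

Changing `n` and `size` gives the planes with varying numbers of schools described for the test scenario.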

Proposed Optimization Algorithm
The GA has been modified to suit the conditions so that the optimization can be achieved by applying starting parameters with N schools or circles on the plane. Number of repetitions has been represented by and used the initial state of the tour. The roads between the schools are treated as a network, which is called a fully connected network.

5: = 0 ← Set initial value of path to zero per created network.
11: ℝ ← Select the best route using from ℝ for end-to-end transmission of locations.
12: ℝ ← Copy ℝ to a final array for each iteration of .
14: return each to an array for averaging times for given .
Additionally, the distance calculation can also be done diagonally. In this algorithm, on line 5, the network is created and the path value is set to 0 before the most optimal route to the neighbouring school or circle is selected. On line 6, the created route is made globally known to all schools in Matiari district. The maximum number of paths is created, and the best of them are reserved for the communication, as on lines 8 and 9.
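To make the overall flow concrete, the following is a compact sketch of a GA loop for a route over N schools. It is not the authors' MATLAB implementation: the closed-tour distance, binary tournament selection, slice-based crossover, swap mutation, elitism and every parameter value are assumptions chosen for illustration.

```python
import math
import random

def tour_length(tour, coords):
    """Length of a closed tour over the given school coordinates."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def run_ga(coords, pop_size=40, generations=200, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    n = len(coords)
    length = lambda t: tour_length(t, coords)
    # Initial population: random visiting orders
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=length)
        history.append(length(population[0]))
        next_gen = [population[0][:]]            # elitism: keep the best tour
        while len(next_gen) < pop_size:
            # Binary tournament selection of two parents
            p1 = min(rng.sample(population, 2), key=length)
            p2 = min(rng.sample(population, 2), key=length)
            # Crossover: keep a slice of p1 in place, fill the rest in p2's order
            i, j = sorted(rng.sample(range(n), 2))
            segment = p1[i:j + 1]
            rest = [g for g in p2 if g not in segment]
            child = rest[:i] + segment + rest[i:]
            # Swap mutation
            if rng.random() < mutation_rate:
                a, b = rng.sample(range(n), 2)
                child[a], child[b] = child[b], child[a]
            next_gen.append(child)
        population = next_gen
    best = min(population, key=length)
    history.append(length(best))
    return best, history

# 12 hypothetical schools on a 100 x 100 plane
point_rng = random.Random(7)
coords = [(point_rng.uniform(0, 100), point_rng.uniform(0, 100)) for _ in range(12)]
best, history = run_ga(coords)
```

Because the best tour of each generation is copied unchanged into the next, the best distance recorded in `history` can only stay the same or decrease over the iterations.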

Subject/REGION/Context
The experiments and simulation-based results were produced using MATLAB 2017a. Without optimization, the monitoring team's routes may consume more fuel and waste much time; with the deployment of the GA the routes are optimized, and the system can generate easy or shortest paths so that the teams can do the job in a short time while saving fuel. The school data were collected from the internet and from the old district profile of Matiari district, which helped in estimating the number and locations of the schools in the district. The distances have been measured and estimated from the information collected from the internet. The school identifiers have been deployed by placing circles as markers on a 10 x 10 km² area. The plotting area can be changed if the size of the experimental area or the location of the schools varies. The proposed approach is equally useful in both scenarios. The starting point may differ, and the monitoring visit can be started from any point. The minimum distance is explored by GA, which is a well-known and popular approach for finding optimized paths in a number of problems. For the purpose of simulation, the paths have been determined by GA, and the schools have been located with the help of Google Maps and other available sources so that they can be placed where they are actually located, as shown in Figure 6; the gender-wise distribution of schools can be seen in Figure 7. The simulations have

Simulation System Model
The system contains various scenarios, which cover the formation of different monitoring teams for different types of schools; for example, one team is responsible for primary schools for boys and another for primary schools for girls. Further teams have been formed to monitor mixed schools, because the number of schools is large and more teams are needed. The following sections present the proposed scenarios for the various teams and schools.

Simulation Parameters
The main parameters have been extracted and used in our study, and the GA parameters are taken from the original GA. The modified and selected parameters are shown in Table 2. The original parameters have been merged with the selected parameters. The mutation and crossover properties are also listed in Table 2. The selection of an individual is based on the energy levels, which show the fitness level of the value. The solution is based on the fitness function, which has already been defined.

Table 2. Parameters used in GA flow chart
Parameters | Symbol
Starting with initial search for current population | ℎ
Initial Population | Π0
A satisfied number of Optimized Paths |
Strongest candidates | ℎ
The selection on the basis of binary tournament | -1 and -2
Selected strong candidates, which are ready for Crossover | ℎ
The candidate nodes selected by Crossover | ℎ+ ′ℎ= ℎ
Loops removing | ℒℎ
New population by Elitism method | ℎ+ ℒℎ=Πℎ+1
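The study does not describe how the "loops removing" step works. One possible reading for a route problem, shown purely as an assumption, is a 2-opt pass that reverses a section of the tour whenever doing so shortens it, which uncrosses paths that loop over themselves. The function names and coordinates are hypothetical.

```python
import math

def tour_length(tour, coords):
    """Length of a closed tour over the given school coordinates."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, coords):
    """Repeatedly reverse tour segments while that shortens the tour."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(candidate, coords) < tour_length(best, coords) - 1e-12:
                    best = candidate
                    improved = True
    return best

# Four hypothetical schools on a square; the starting tour crosses itself
coords = [(0, 0), (0, 10), (10, 10), (10, 0)]
crossing = [0, 2, 1, 3]
untangled = two_opt(crossing, coords)
```

In this small case the pass turns the self-crossing route into the route along the edges of the square, which could then be passed to the elitism step as a candidate for the new population.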

Boys School Distance Optimization Results
The system separates the various school systems and then optimizes the routes for the various teams. As discussed earlier, there are more than 900 schools, and it is nearly impossible to form one team for such a number; therefore, a number of teams have been formed so that the monitoring process can be done efficiently. The system first plots the schools and then optimizes the routes to save fuel and time. Various experiments have been performed and the optimized distances obtained. The distances and the number of schools for boys are presented in Tables [...] boys in Matiari district, so no experiment was set up for this type of data. There are two higher secondary schools for boys, and no optimization is needed there because there is only one distance between the two. Figure 8 shows the final optimized path for the monitoring teams: a total of 831.48 kilometres is the final distance to visit 98 schools for boys in Matiari district.

Girls School Distance Optimization Results
There are 134 schools for girls in Matiari district, of which 118 are primary schools, 4 are middle schools and 12 are secondary schools. There is no elementary or higher secondary school for girls in Matiari district. The following sections introduce the plotting of the schools and then the optimization of the routes to save fuel and time. To check that the algorithm works on a large amount of data, the GA has been tested on the overall set of schools. The overall path has been optimized, and the results show that the GA performs comparably even when the number of nodes, locations or points is increased. Figure 9 illustrates the final optimized result for all 134 girls' schools in Matiari district.

Overall Boys and Girls Schools Distance Optimization Results
A new set of experiments was performed on the mixed or total number of schools located in Matiari district. To make the experiments more versatile, some new experiments were added to the study. To check that the algorithm works on a large amount of data, the GA has been tested on the overall set of schools. The overall path has been optimized, and the results show that the GA performs comparably even when the number of nodes, locations or points is increased. Figure 10 illustrates the final optimized result for all 232 schools for both boys and girls in Matiari district.

Conclusion
Optimization is a field of computer science that is useful when a human being is unable to solve the problem of finding an efficient path for a particular problem. Many schools in Sindh province face teacher absenteeism. To address this problem, the Government of Sindh Education and Literacy Department has recently started a monitoring program to monitor teachers' attendance. School monitoring officers or teams have to visit certain schools located in various villages and union councils of Matiari district. These schools lie at different distances from one another, so a short route is provided to save time and fuel. The solution is very difficult and time-consuming for a human being to calculate manually, so an integrated application has been developed that provides the optimal or shortest path for the school monitoring officers or teams of the Sindh Education and Literacy Department, so that the school monitoring process is made easy and much fuel can be saved.
To solve the above problem, the in-depth analysis of the problem was the first phase, followed by the calculation of the school distances. Detailed routes were then formed so that the distances could be used in simulation-based optimization. The optimization is based on GA, and a graphical user interface has been developed for this purpose.
The optimization has been performed with the help of GA by placing each school as a point in an x-y plane. The school system, the classification of the schools and the related facts have been collected from the Sindh Education Management Information System (2013-2014). A total of 926 schools are under consideration for the optimization. A single team cannot monitor such a number of schools, so the monitoring teams have been divided according to the number of schools. Various experiments have been performed so that a realistic picture of the fuel and time savings for the monitoring teams can be formed.
The schools were divided into schools for boys and schools for girls. The girls' schools were grouped as primary and secondary, and the boys' schools in the same way. The school monitoring teams were given an optimized path for their visits, which lets them save time and fuel. Results show that the GA has produced an efficient and accurate path for the monitoring team so that fuel and time can be saved.

Future Work
The current study can be extended by comparing the GA output with that of other optimization algorithms, so that the efficiency can be measured. The study can also be applied to the data of the other 21 districts of Sindh province, or an even wider scope can be pursued.