Exploiting the Local Optima in Genetic Algorithm using Tabu Search

Objectives: To explore the process of selecting retrieval schemes, their weights, and the fusion function for data fusion in information retrieval. Methods/Statistical Analysis: The selection is carried out using a hybrid Genetic Algorithm (GA). The fusion function, the retrieval schemes and their weights give rise to a very large number of combinations, and finding an optimal solution among them depends entirely on exploration. Findings: We used odd and even point crossover as the exploration tool. This exploration tool suffers from slow convergence. The convergence rate can be improved by merging Tabu search, one of the best local search methods, with the genetic algorithm. This Tabu GA is used to select the retrieval schemes, weights and fusion function. The outcome of the experiments conducted over the test data sets, namely 1. adi, 2. cisi, and 3. cran, looks promising. Application/Improvements: We achieved a 6.89% improvement in performance, and the significance of the result is tested statistically. The convergence rate is also improved.


Introduction
Information Retrieval (IR) is the process of finding relevant information in a massive volume of data [1-3]. The IR system processes, arranges, stores, and proffers the relevant items based on the user's query. The correlation between a document and the query is calculated using various similarity measures [4]. The performance of an IR system varies from one corpus to another [5][6]. Fusion is used to overcome this drawback [7][8].
Data fusion is the process of merging results from more than one resource [9]. It combines the results from various retrieval schemes and strategies. The fusion function converts the multidimensional vector into a scalar [10].
The conversion of the vector to a scalar varies from one function to another. Hence, the problem becomes multi-objective. Optimization tools are useful for solving multi-objective problems [11]. The Genetic Algorithm (GA) is regarded as the best among these global optimization tools [12]. It has bio-inspired operators such as reproduction, crossover and mutation for finding the optimal solution.
In our problem, we use GA to find the best fusion function, retrieval schemes, and their weights. The problem demands more exploration, and we proposed a new exploration tool called 'odd and even point crossover'. The exploration suffers from slow convergence, which can be overcome by combining the best local search algorithm [13] with the GA. Tabu search [14] is one of the best local optimization tools. The Tabu GA is used to find the optimal solution for our problem.
Keywords: Genetic Algorithm, Information Retrieval, Odd and Even Point Crossover, Tabu GA, Tabu Search
The rest of the study is organized as follows. Section 2 gives the GA based selection and its results. Section 3 analyzes the convergence of the GA based selection. The Tabu GA based selection is given in Section 4, and its convergence analysis and comparison with the GA based selection are given in Section 5. Section 6 concludes with the future direction of our research.

Proposed Method
Fusion combines the advantages of all participating members 15,16 . In IR, fusion becomes a multi-objective optimization. It has to select the best fusion function, best retrieval schemes and the optimal weights for the retrieval schemes.
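As an illustration of the fusion step, the following is a minimal Python sketch of weighted fusion with the four standard combination functions (COMB SUM, COMB MAX, COMB MIN and COMB MNZ). The function and variable names are hypothetical. Two choices are assumptions of the sketch rather than details taken from our experiments: the weight of a scheme multiplies its score before combination, and a document missing from a scheme's result list is given a score of 0.

```python
from typing import Callable, Dict, List


def comb_sum(scores: List[float]) -> float:
    """Sum of the individual similarity scores."""
    return sum(scores)


def comb_max(scores: List[float]) -> float:
    """Largest individual similarity score."""
    return max(scores)


def comb_min(scores: List[float]) -> float:
    """Smallest individual similarity score."""
    return min(scores)


def comb_mnz(scores: List[float]) -> float:
    """Sum of the scores multiplied by the number of non-zero scores."""
    nonzero = sum(1 for s in scores if s > 0)
    return sum(scores) * nonzero


FUSION_FUNCTIONS: Dict[str, Callable[[List[float]], float]] = {
    "COMB SUM": comb_sum,
    "COMB MAX": comb_max,
    "COMB MIN": comb_min,
    "COMB MNZ": comb_mnz,
}


def fuse(
    scheme_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    fusion: str,
) -> Dict[str, float]:
    """Fuse the scores of the selected schemes into one score per document.

    scheme_scores[scheme][doc_id] is the similarity given by one scheme.
    weights holds only the selected schemes and their weights.
    """
    combine = FUSION_FUNCTIONS[fusion]
    doc_ids = set()
    for scheme in weights:
        doc_ids.update(scheme_scores[scheme])
    fused = {}
    for doc in doc_ids:
        # Assumption: a document a scheme did not retrieve scores 0 there.
        weighted = [w * scheme_scores[s].get(doc, 0.0) for s, w in weights.items()]
        fused[doc] = combine(weighted)
    return fused
```

Each fusion function takes the vector of weighted scores for one document and returns a single scalar, so exchanging the fusion function changes only the `combine` step.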
We used GA to meet this need. As the optimization involves more than one goal and several parameters, we have to explore the search space. 'Odd and even point crossover' is used for this purpose, and it has been reported in our previous work [17]. The experiments are conducted over three data sets, namely adi, cisi, and cran. The characteristics of the data sets are given in Table 1.
The fusion functions used in our experiments are COMB MNZ, COMB MAX, COMB MIN and COMB SUM [9]. The retrieval schemes used are dice, inner product, and conjunctive normal form with P values of 1.5 and 2.5. The GA is used here to find the best fusion function, the best retrieval schemes and the optimal weights for the selected schemes.
The average 11-point interpolated precision is used as the fitness function. It is given by Equation 1, where the sum runs over the queries $q_i$:
$$\sum_{i=1}^{n} \mathrm{ave}\,P_s(q_i) \qquad (1)$$
The gene positions are divided into odd and even locations. A value P odd, ranging from 0 to 1, is generated randomly. If the generated value is less than 0.5, the crossover is carried out on the odd locations. If it is greater than 0.5, the crossover is carried out on the even locations.
The experiments are carried out 50 times with the GA parameters given in Table 2. The results obtained for this parameter setting are given in Table 3. There is an improvement of 6.28% in the performance. It seems that the convergence is slow and the variation is high. We want to test the convergence, and this is done in the next section.
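The odd and even point crossover can be sketched in Python as follows. This is a minimal sketch under stated assumptions: "crossover on the odd (or even) locations" is taken to mean that the two parents exchange their genes at those positions, positions are counted from 1, and a draw of exactly 0.5 (which the description leaves open) is treated as the even case. The function name and signature are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple


def odd_even_crossover(
    parent_a: Sequence,
    parent_b: Sequence,
    p_odd: Optional[float] = None,
) -> Tuple[List, List]:
    """Exchange the genes at the odd or the even locations of two parents.

    p_odd is the random draw in [0, 1); it is generated here when omitted.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if p_odd is None:
        p_odd = random.random()
    # Locations are 1-based in the text: odd locations 1, 3, 5, ...
    # correspond to 0-based indices 0, 2, 4, ...
    # Assumption: a draw of exactly 0.5 falls into the even case.
    start = 0 if p_odd < 0.5 else 1
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(start, len(child_a), 2):
        child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b
```

For example, with parents `[1, 2, 3, 4, 5]` and `[10, 20, 30, 40, 50]` and a draw of 0.3, the first child is `[10, 2, 30, 4, 50]`. Because every second gene is exchanged, the gene structure is disturbed at many positions at once, which is consistent with the strong exploration and slow convergence reported for this operator.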

Selection of Retrieval Scheme Using GA
The odd and even point crossover is used as a robust exploration tool in comparison with the other existing crossover operators [18]. It successfully explored the search space. The algorithm and the results obtained using this operator are discussed in the previous section.
The previous section presents the IR system performance, but the properties of the tool are not discussed. This section analyses the convergence property of the exploration tool, namely odd and even point crossover. The selection problem for data fusion in information retrieval has too many combinations, and all of them ought to be explored. This need leads to frequent disturbance of the gene structure and results in slow convergence.

Average Fitness Value of the Individual
The average fitness value of the individuals present in each generation is recorded to measure the convergence property. This is continued for 100 generations, as we set the number of generations to 100.

Variation in the Average Fitness Values from One Generation to Another
We want to confirm the slow convergence. It can be confirmed by measuring the difference in fitness value between two successive generations.
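The two convergence measures can be computed as in the following Python sketch. The function names are hypothetical, and expressing the percentage of deviation relative to the earlier of the two generations is an assumption of the sketch.

```python
from statistics import mean
from typing import List, Sequence


def average_fitness_per_generation(
    populations: Sequence[Sequence[float]],
) -> List[float]:
    """Mean fitness of the individuals in each generation."""
    return [mean(generation) for generation in populations]


def successive_differences(avg_fitness: Sequence[float]) -> List[float]:
    """Absolute difference in average fitness between successive generations."""
    return [abs(b - a) for a, b in zip(avg_fitness, avg_fitness[1:])]


def percentage_deviation(avg_fitness: Sequence[float]) -> List[float]:
    """Difference between successive generations as a percentage.

    Assumption: the deviation is taken relative to the earlier generation.
    """
    return [100.0 * abs(b - a) / a for a, b in zip(avg_fitness, avg_fitness[1:])]
```

Small differences that shrink over the generations indicate smooth convergence, whereas large differences that persist indicate that the population is still being disturbed by exploration.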

Tabu GA
The genetic algorithm based selection has been discussed in Section 3, including the gene structure and the GA parameters. This section discusses the Tabu GA for data fusion. The Tabu list for the various components is given in Table 4. The Tabu tenure period is the restriction imposed on the individual components: during this period, a component is neither deleted from nor included in the solution. The Tabu list and tenure periods are used as the exploitation tool.
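The role of the Tabu list and the tenure period can be sketched in Python as follows. This is a minimal illustration, not the implementation used in our experiments: the class and function names are hypothetical, the tenure is counted in generations, and a component is placed on the list immediately after it has been changed.

```python
from typing import Dict, Set


class TabuList:
    """Tracks components that may not be changed for `tenure` generations."""

    def __init__(self, tenure: int) -> None:
        self.tenure = tenure
        self._remaining: Dict[str, int] = {}

    def add(self, component: str) -> None:
        # With a tenure of 0 nothing ever becomes tabu.
        if self.tenure > 0:
            self._remaining[component] = self.tenure

    def is_tabu(self, component: str) -> bool:
        return self._remaining.get(component, 0) > 0

    def tick(self) -> None:
        """Advance one generation and release components whose tenure ended."""
        self._remaining = {
            c: t - 1 for c, t in self._remaining.items() if t - 1 > 0
        }


def apply_move(solution: Set[str], component: str, tabu: TabuList) -> bool:
    """Include or delete a component unless it is tabu; return True if applied."""
    if tabu.is_tabu(component):
        return False  # neither deleted nor included during the tenure period
    if component in solution:
        solution.remove(component)
    else:
        solution.add(component)
    tabu.add(component)
    return True
```

With a tenure of 0 no move is ever blocked, so the search is unrestricted; a long tenure freezes recently changed components and pushes the search towards exploiting the current solution.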

Convergence of Tabu Genetic Algorithm
This section analyses the convergence of the Tabu Genetic Algorithm (Tabu GA). The convergence is compared with that of the odd and even point crossover used in the previous section. The convergence analysis is carried out using the same two considerations as in the last section.

Average Fitness Value
The average fitness values of the individuals present in each generation are recorded and tabulated. The main intention of this work is to analyse the convergence of the Tabu Genetic Algorithm. The convergence of Tabu GA is compared with that of the odd and even point crossover. When Tables 5 and 6 are compared, there is a huge difference between the minimum and maximum values. The result shows smooth convergence for Tabu GA compared with the odd and even point crossover. Again, we want to confirm the results using statistical methods.
The difference in the average fitness value for each successive generation is compared using Student's t-test. The hypotheses used for the testing are given below:
H0: There is no difference between the mean values of the two populations: µ1 = µ2.
H1: There is a significant difference between the mean values of the two populations: µ1 ≠ µ2.
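The test statistic for these hypotheses can be computed as in the following Python sketch. It assumes the independent two-sample form of Student's t-test with pooled (equal) variance; whether this form or a paired form was used is not stated, so this choice and the function name are assumptions.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def student_t_statistic(
    sample_1: Sequence[float],
    sample_2: Sequence[float],
) -> Tuple[float, int]:
    """Return the pooled-variance t statistic and its degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    # Pooled estimate of the common variance of the two populations.
    pooled_var = (
        (n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)
    ) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("both samples have zero variance")
    standard_error = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t_value = (mean(sample_1) - mean(sample_2)) / standard_error
    return t_value, n1 + n2 - 2
```

Here the two samples would be the successive-generation differences of the two algorithms. H0 is rejected when the absolute value of the computed t exceeds the critical value of the t distribution for the returned degrees of freedom at the chosen significance level.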

Results and Discussions
The experiments are conducted over the same datasets used in Section 2. The characteristics of the datasets and the components are described in Section 2. The results obtained for the Tabu GA are given in Table 7. Table 7 shows a slight improvement in the performance. The improvement in the performance is due to the exploitation of the search space. The main intention of this work is to test the convergence of the Tabu GA. The convergence analysis is carried out in the next section.
The average precision value for all the data sets is recorded and plotted. The plotted graphs are shown in Figures 1-3. The graphs show steep fluctuation from one generation to the next. We have not got the same results for any two successive generations. This is an indicator of vigorous exploration and slow convergence. The absolute value of the difference between two successive generations is calculated, tabulated and plotted as a bar chart. The charts showing the difference are given in Figures 4-6. Figures 4-6 show rapid fluctuation. The step size between the bars is also high. The difference between successive generations is computed as the percentage of deviation. The percentage of deviation is tabulated, and we give the minimum and the maximum values in Table 4. Figures 1-6 and Table 4 show the slow convergence of the odd and even point crossover. Convergence is a relative property, and we want to test whether there is an improvement in the convergence when the best local search, i.e. Tabu, is merged with the GA. The convergence of Tabu GA is analyzed in Section 5.
Figures 7-9 show the average fitness value for the three data sets over 100 generations. The transition from one generation to the next is smooth. We have calculated the difference between the fitness values for each successive generation. The difference in the fitness value is plotted, and it is shown in Figures 10-12. Figures 10-12 show the fluctuation in fitness value from one generation to the next. The step size is small, and the percentage of deviation from one generation to the next is calculated and tabulated in Table 7. The computed t value is given in Table 8. The null hypothesis is success-
Table 7. Percentage of maximum and minimum difference.
Figure 11. Difference in fitness value between successive generations over CISI.

Conclusions
The exploitation of the search space has been carried out using Tabu GA. At the end of the experiments, the average fitness value of the individuals showed steep fluctuation from one generation to the next. We did not get the same results for any two successive generations. This is an indicator of vigorous exploration and slow convergence.
The convergence of Tabu GA has been tested, and it has been shown that Tabu GA converges more smoothly than the conventional GA. The experiments establish a relationship between GA and Tabu GA. If the Tabu tenure period is '0', then Tabu GA becomes the conventional GA. If the tenure period is high, then Tabu GA becomes Tabu Search. Hence, there should be a trade-off in choosing the Tabu tenure value. The impact of the Tabu tenure period on Tabu GA, and in particular on the odd and even point crossover based Tabu GA, has not been studied thoroughly, and we intend to carry out this work in the near future. We also want to merge and test other possible local search algorithms with the GA. By doing so, we want to produce a new type of hybrid search tool.