An Evaluation of the Impact of Multicollinearity on the Performance of Various Robust Regression Methods

Objectives: To examine the performance of Ordinary Least Squares (OLS) and several robust regression methods, namely M-estimation, Least Median of Squares (LMS), Least Trimmed Squares (LTS), MM-estimation and S-estimation, under varying levels of collinearity, using the criteria Total Absolute Bias (TAB) and Total Mean Square Error (TMSE) together with some graphical tools. Methods/Statistical Analysis: Robust regression methods ensure good performance even when the fundamental assumption of normality is not satisfied. The presence of multicollinearity, however, affects the results of robust regression methods and renders them unsatisfactory. A quantitative evaluation of these techniques is provided using the TAB and TMSE criteria. Results are summarised using box plots of absolute bias, along with graphs of TAB and TMSE. Findings: The results show that at minor levels of collinearity the effect is low and similar across methods, but at greater levels of collinearity the effect is strong and the methods perform quite differently. It is also shown that a greater magnitude of collinearity combined with higher percentages of outliers ranks the underlying methods quite differently, with the MM-estimation method turning out to be the worst performer. Conclusion: When applying any statistical method it is necessary to consider all the assumptions underlying that method, as well as every aspect of the data, to avoid misleading results. It is illustrated that the MM-estimation method, although the best candidate under high percentages of outliers alone, becomes the worst performer when a high level of collinearity intervenes simultaneously; hence robust ridge techniques need to be adopted.


Introduction
Regression analysis models the relationship between variables through some appropriate mathematical equation. When the regression model satisfies certain basic assumptions, the OLS estimators are the best linear unbiased estimators 1,2 . The OLS estimate is highly sensitive to violations of these assumptions, and even a single contaminated observation can render the OLS estimator unreliable. Researchers have therefore sought alternative estimation procedures, known as robust regression methods, which are robust to outliers. Multicollinearity, in turn, decreases the reliability of estimates, potentially affecting estimation, forecasting and hypothesis testing. Robust regression is a good alternative in case of departures from normality, but is suspected to lose performance when the problem of non-normality is joined by multicollinearity concurrently. In this paper efforts are made to evaluate the performance of various methods (OLS, M, LTS, LMS, S, and MM) under various simulation settings, specifically investigating the influence of collinearity levels on the performance of these methods.

Ordinary Least Squares Method
The least squares method is the most commonly used technique for estimating the parameters of the model. In this technique, estimates of the parameters are obtained by minimizing the sum of squared residuals. When the linear regression model satisfies the basic assumptions, the OLS estimators remain best linear unbiased 1 . This method provides an explicit estimate of the parameters from the observed data as:

\hat{\beta}_{OLS} = (X'X)^{-1} X'y

The logic behind the frequent use of this method is its computational simplicity, but unfortunately the method depends on a restrictive set of assumptions and is now widely criticized for lacking robustness 3 .
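As a minimal numerical sketch of the least squares principle, the following assumes simulated data with hypothetical true coefficients (intercept 2, slope 3); the variable names are illustrative, not from the paper.

```python
import numpy as np

# Simulated data for illustration: y = 2 + 3*x + Gaussian noise
rng = np.random.default_rng(0)
n = 200
x = rng.standard_normal(n)
y = 2.0 + 3.0 * x + 0.5 * rng.standard_normal(n)

# Design matrix with an intercept column
X = np.column_stack([np.ones(n), x])

# OLS solution of min ||y - X b||^2; lstsq avoids forming (X'X)^{-1} explicitly
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # estimates close to the true values [2, 3]
```

Replacing even a few entries of `y` with gross outliers would pull `beta_hat` far from the truth, which is the lack of robustness criticized above.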

M-estimation Method
M-estimation, a common robust regression procedure, was first introduced in 2 . This technique is in a sense a generalization of least squares, replacing the quadratic loss function by a function ρ(·) that is symmetric and continuous with a unique minimum at zero 4,11 . The function ρ(·) may be chosen in such a way that it represents some weighting scheme for the residuals. The set of normal equations to be solved is given by the system:

\sum_{i=1}^{n} \psi\!\left(\frac{e_i}{s}\right) x_i = 0

where ψ(e) = ρ'(e) and s is an estimate of the residual scale. A monotone choice of ψ will not weight discrepant values as heavily as least squares, whereas a re-descending ψ function yields a weighting scheme that assigns decreasing weights up to a definite distance (e.g. 3σ) and then sets the weight to zero as the distance increases further. Some of the proposals for ρ(·), ψ and the weight function are given in Table 1.
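In practice these normal equations are usually solved by iteratively reweighted least squares (IRLS). The sketch below, with hypothetical function names, uses the Huber weight function and the normalized MAD as the residual scale estimate; it is an illustration of the idea, not the paper's implementation.

```python
import numpy as np

def huber_weights(u, c=1.345):
    """Huber weight w(u) = psi(u)/u: 1 inside [-c, c], c/|u| outside."""
    w = np.ones_like(u)
    out = np.abs(u) > c
    w[out] = c / np.abs(u[out])
    return w

def m_estimate(X, y, c=1.345, n_iter=100, tol=1e-9):
    """M-estimation via IRLS, starting from OLS, with MAD residual scale."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745  # robust scale
        w = huber_weights(r / s, c)
        Xw = X * w[:, None]  # weighted normal equations
        beta_new = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Contaminated data: 10% of the errors shifted upward by 10
rng = np.random.default_rng(1)
n = 200
x = rng.standard_normal(n)
e = rng.standard_normal(n)
e[:20] += 10.0
y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(n), x])
beta_m = m_estimate(X, y)
print(beta_m)  # slope stays near 2; intercept is pulled less than OLS
```

Because the Huber ψ is monotone rather than re-descending, large outliers are down-weighted but never fully rejected.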

Least Median of Squares (LMS) Regression
LMS regression was suggested in 3 , using the idea of minimizing the median of the squared residuals rather than their sum. Under this procedure the estimates of the model parameters are given by:

\hat{\beta}_{LMS} = \arg\min_{\beta} \; \operatorname{med}_i \, e_i^2

The LMS regression estimator, attaining a high breakdown value of almost 0.5, was the first equivariant estimator to do so. Although the LMS estimator is robust to outliers in the y-direction as well as in the x-space, it has the drawback that its efficiency is quite low compared with least squares in the case of Gaussian errors. Owing to this deficiency the LMS estimator has very little direct use, but it is often employed as an initial estimator for diagnostic purposes or within other robust techniques 5 .
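LMS has no closed form, so it is commonly approximated by exact fits on random p-point subsets, keeping the fit with the smallest median squared residual. The following sketch (hypothetical helper name, illustrative data) shows that LMS tracks the majority pattern even with 40% contaminated responses.

```python
import numpy as np

def lms_fit(X, y, n_trials=1000, seed=0):
    """Approximate LMS: exact fits on random p-point subsets, keep the
    one minimizing the median squared residual over the full data."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_crit = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])  # exact fit on p points
        except np.linalg.LinAlgError:
            continue  # singular subset, skip
        crit = np.median((y - X @ beta) ** 2)
        if crit < best_crit:
            best_crit, best_beta = crit, beta
    return best_beta

# 40% of the responses shifted by +10: LMS still follows the clean majority
rng = np.random.default_rng(2)
n = 200
x = rng.standard_normal(n)
e = rng.standard_normal(n)
e[:80] += 10.0
y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(n), x])
beta_lms = lms_fit(X, y)
```

The wide tolerance needed to verify `beta_lms` reflects exactly the low Gaussian efficiency noted above: LMS is resistant but noisy.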

LTS Regression
The LTS regression method is an alternative robust regression technique suggested by Rousseeuw 3 . To elude outliers, this procedure minimizes the sum of the squared residuals after the largest squared residuals are trimmed. The LTS regression estimator is given by:

\hat{\beta}_{LTS} = \arg\min_{\beta} \sum_{i=1}^{h} (e^2)_{(i)}

where (e^2)_{(1)} \le \ldots \le (e^2)_{(n)} are the ordered squared residuals and h is the number of observations retained. This method achieves a breakdown point of ([n/2] - p + 2)/n. A difficulty of the LTS method is the sorting of the squared residuals required in its objective function 5 . Several algorithms have been suggested in the literature for this approach: a simulated-annealing-based LTS algorithm developed in 14 and the 'Feasible Set Algorithm' in 12 . Another algorithm, FAST-LTS, given in 13 , is much faster than the existing alternatives. The high statistical efficiency and faster rate of convergence of LTS over LMS make LTS a more appropriate candidate than LMS as an initial step for two-stage estimators such as the MM-estimator and the generalized M-estimators 13,15,16 . In 17 an L1 penalty is imposed on the LTS estimator to develop a sparse estimator.
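The key device in FAST-LTS is the concentration step: refit by least squares on the h observations with the smallest squared residuals, which never increases the LTS objective. The sketch below (hypothetical names, simplified relative to the published algorithm) illustrates the idea.

```python
import numpy as np

def lts_fit(X, y, h=None, n_starts=50, n_csteps=20, seed=0):
    """Approximate LTS via concentration (C-) steps in the spirit of FAST-LTS."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    if h is None:
        h = (n + p + 1) // 2  # default coverage: roughly half the data
    best_beta, best_crit = None, np.inf
    for _ in range(n_starts):
        subset = rng.choice(n, size=p + 1, replace=False)
        for _ in range(n_csteps):
            beta = np.linalg.lstsq(X[subset], y[subset], rcond=None)[0]
            r2 = (y - X @ beta) ** 2
            new_subset = np.argsort(r2)[:h]  # h smallest squared residuals
            if np.array_equal(np.sort(new_subset), np.sort(subset)):
                break  # C-steps converged
            subset = new_subset
        crit = np.sort(r2)[:h].sum()  # trimmed sum of squares
        if crit < best_crit:
            best_crit, best_beta = crit, beta
    return best_beta

# 30% y-outliers: the trimmed fit ignores them
rng = np.random.default_rng(3)
n = 200
x = rng.standard_normal(n)
e = rng.standard_normal(n)
e[:60] += 10.0
y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(n), x])
beta_lts = lts_fit(X, y)
```

Because each C-step ends in a full least-squares fit on h points, the result is noticeably more efficient than the LMS approximation above.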

S-Estimator (S-Regression)
The S-estimator, an alternative estimator possessing a high breakdown value, was suggested in 3 . The S-estimator minimizes an M-estimate of the residual scale. The method estimates the parameters as:

\hat{\beta}_{S} = \arg\min_{\beta} \, s(e_1(\beta), \ldots, e_n(\beta))

where the scale estimate s is the solution of

\frac{1}{n} \sum_{i=1}^{n} \rho\!\left(\frac{e_i}{s}\right) = K

A dominant choice of ρ is Tukey's bisquare function. The S-estimator attains a high breakdown value of 50% if K and ρ satisfy K/ρ(c) = 0.5, where c is the tuning constant. The S-estimator possesses the properties of high breakdown and asymptotic normality. The compromise between breakdown point and efficiency is determined by the choice of the tuning constant c and K. In 18 it is concluded that the S-estimator under Gaussian errors can achieve an efficiency of 0.33 with a breakdown of 50%; in 19 fast-S, an approximating algorithm for obtaining the S-estimator of regression, is suggested.
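The breakdown condition K/ρ(c) = 0.5 can be checked numerically. The sketch below codes Tukey's bisquare ρ and evaluates K = E[ρ(Z)] for Z ~ N(0,1) at the commonly cited constant c ≈ 1.547; the function names are illustrative.

```python
import numpy as np

def bisquare_rho(u, c):
    """Tukey's bisquare rho: rises smoothly and stays flat at c^2/6 beyond |u| = c."""
    u = np.minimum(np.abs(u), c)
    return (u**2 / 2) * (1 - (u / c) ** 2 + (u / c) ** 4 / 3)

# K = E[rho(Z)] under the standard normal, by a fine Riemann sum
c = 1.547
u = np.linspace(-8.0, 8.0, 200001)
phi = np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)
K = np.sum(bisquare_rho(u, c) * phi) * (u[1] - u[0])

# Breakdown value = K / rho(c), with rho(c) = c^2 / 6
breakdown = K / (c**2 / 6)
print(breakdown)  # close to 0.5 for this choice of c
```

Raising c lowers this ratio (smaller breakdown) while improving Gaussian efficiency, which is the compromise described above.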

MM-Estimation
MM-estimation, recommended in 4 , is a special brand of M-estimation: a multistage estimator combining the high breakdown value of an initial robust estimator with the high efficiency of a subsequent M-step. The MM-estimator is computed as follows:
1. Compute an initial estimate β_initial with a high breakdown point.
2. Compute an M-estimate S_m of the scale of the residuals from β_initial.
3. Using β_initial from the first step and the scale estimate S_m from the second step, obtain β_final as the solution to:

\sum_{i=1}^{n} \psi\!\left(\frac{e_i}{S_m}\right) x_i = 0

for a particular value of the tuning constant c_0.
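The three stages above can be sketched compactly. The version below, with hypothetical names, uses an LMS-type subset search as the high-breakdown start and the normalized MAD in place of a full M-scale, then runs a bisquare IRLS step with the scale held fixed; it is a simplified illustration, not the exact published algorithm.

```python
import numpy as np

def bisquare_weights(u, c=4.685):
    """Re-descending bisquare weights: zero beyond |u| = c."""
    w = np.zeros_like(u)
    inside = np.abs(u) <= c
    w[inside] = (1 - (u[inside] / c) ** 2) ** 2
    return w

def mm_estimate(X, y, n_trials=200, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Stage 1: high-breakdown start (LMS-type random subsets)
    best, best_crit = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(n, size=p, replace=False)
        try:
            b = np.linalg.solve(X[idx], y[idx])
        except np.linalg.LinAlgError:
            continue
        crit = np.median((y - X @ b) ** 2)
        if crit < best_crit:
            best_crit, best = crit, b
    # Stage 2: robust residual scale (normalized MAD stands in for the M-scale)
    s = np.median(np.abs(y - X @ best)) / 0.6745
    # Stage 3: bisquare M-step from the robust start, scale held fixed
    beta = best
    for _ in range(100):
        w = bisquare_weights((y - X @ beta) / s)
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        if np.max(np.abs(beta_new - beta)) < 1e-9:
            break
        beta = beta_new
    return beta_new

# 30% y-outliers: the re-descending weights reject them entirely
rng = np.random.default_rng(4)
n = 200
x = rng.standard_normal(n)
e = rng.standard_normal(n)
e[:60] += 10.0
y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(n), x])
beta_mm = mm_estimate(X, y)
```

With c = 4.685 the final M-step has roughly 95% Gaussian efficiency, while the breakdown value is inherited from the initial estimator.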
The MM-estimator enjoys high efficiency and a high breakdown value of 50%, but unluckily may be influenced by the occurrence of high-leverage observations [20][21][22] . In 23 a robust version of the ridge estimator, referred to as the Weighted Ridge MM-estimator (WRMM), is offered by combining weighted ridge regression with MM-estimation, and in 24 a penalized MM-estimation, called MM-lasso, is developed using the L1 penalty and the mechanism of MM-estimation.

Simulation Studies
To compare the performance of the different methods, numerous simulation options are examined. A simulation structure is implemented that allows non-normality and multicollinearity together. The particulars of the various simulation settings are as follows. Methods evaluated: The methods assessed under the various simulation settings are OLS, M-estimation, LMS, LTS, S and MM.
Sample size: In the different simulation settings the sample size is considered at 50, 100, 150 and 200.
Number of predictor variables: For fitting models under different multicollinearity levels and fractions of outliers, the number of predictors (P) is considered at 2, 4 and 6.
Fractions of outliers: In the various simulation settings the particular focus is on y-outliers. While judging the performance of the different methods, several fractions of outliers, namely 10%, 20%, 30% and 40%, are generated in the data sets.
Levels of collinearity: To allow different levels of collinearity, the values of the predictors are generated by means of a methodology used in [25][26][27][28] . The explanatory variables are generated through the mechanism:

x_{ij} = \sqrt{1-\rho^2}\, z_{ij} + \rho\, z_{i,p+1}

where z_{ij} is a standard normal variate and ρ is specified in such a way that the correlation coefficient between any two predictor variables is maintained at ρ². The scatterplot matrices of the predictors X1, X2, X3 and X4 generated by this scheme with different values of ρ are given in Figures 1-2, and the correlation matrix in Table 2.
For the error distributions considered in Cases I to V, at n = 50, 100, 150, 200 and P = 2, 4, 6, the values of TMSE vary: they increase with an increase in the value of P and decrease as the sample size grows. With respect to performance, a nearly similar pattern is observed for all methods at each value of n (50, 100, 150, 200) and P (2, 4, 6). It is observed that at lower collinearity levels the TMSE values of all techniques are not very divergent, but at greater collinearity levels the TMSE values of the methods differ considerably. The graphs of TMSE for OLS, M, LTS, LMS, MM and S at different collinearity levels and outlier percentages for n = 200 are given in Figure 3. Figure 3 reveals that at smaller levels of multicollinearity increasing percentages of outliers cause the TMSE to increase with small differences among methods, whereas at higher levels of multicollinearity the TMSE values increase markedly, with wider differences, and yield a different ranking of the methods. In Figure 4 a graphical analysis for OLS, M, LMS, LTS, S and MM is given for the second performance measure, total absolute bias. The results in Figure 4 are consistent with those in Figure 3.
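The data-generating scheme described above can be sketched as follows; the helper names are illustrative, and the correlation between any two predictors is ρ² by construction.

```python
import numpy as np

def make_predictors(n, p, rho, seed=0):
    """x_ij = sqrt(1 - rho^2) * z_ij + rho * z_{i,p+1}, z iid N(0,1),
    giving pairwise predictor correlation rho^2."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, p + 1))
    return np.sqrt(1 - rho**2) * z[:, :p] + rho * z[:, [p]]

def make_errors(n, frac_outliers, seed=0):
    """Mixture errors (1-f)*N(0,1) + f*N(10,1), as in Cases II-V."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    mask = rng.random(n) < frac_outliers
    e[mask] += 10.0  # shift the contaminated fraction to N(10, 1)
    return e

# High collinearity: rho = 0.99 makes pairwise correlations about 0.98
X = make_predictors(n=200, p=4, rho=0.99)
corr = np.corrcoef(X, rowvar=False)
e = make_errors(n=200, frac_outliers=0.1, seed=3)
```

Sweeping `rho` over, say, 0.1 to 0.99 and `frac_outliers` over 0.1 to 0.4 reproduces the grid of scenarios whose TMSE and TAB results are summarised in the figures.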
The results over the two performance measures, TMSE and TAB, for the different scenarios are given in Tables 3 and 4, respectively. In Figure 5 a graphical analysis using box plots for the various situations is also given.

Conclusions
In this study efforts are made to compare the performance of different regression methods under the influence of varying levels of multicollinearity. A case-wise discussion of the performance of the methods follows. Case I: ε ~ N(0, 1): With standard normal errors, the performance of all methods is very similar at lower collinearity levels, but as the level of collinearity grows they behave quite differently. Moreover, it is evident that at higher collinearity levels LTS and LMS appear the poorest, S is next, whereas OLS, M and MM perform reasonably well.
Case II: ~ 0.9N (0, 1) + 0.1N (10,1): In this case the error distribution is considered in this case allow 10% outliers in y-direction. In this case at lower levels of collinearity OLS appear to be markedly different, while the other methods behave fairly alike. At higher collinearity levels and 10% fraction of outliers OLS appear the worse whereas (LTS, LMS and S) the next and (M, MM) perform sensibly well with MM the finest of all.
Case III: ~ 0.8N (0, 1) + 0.2N (10,1): For a fraction of 20% outliers OLS is markedly different at all levels while the remaining all methods perform nearly similar at lower collinearity levels but behave very differently for greater values of collinearity levels. In this case the ranking of the methods nearly similar to that in Case II.
Case IV: ~ 0.7N (0, 1) + 0.3N (10,1): In this case at lower levels of collinearity there seem to be two categories (M-Estimation and OLS) and (LTS, LMS, S and MM), together all the methods have quite diverse behavior at higher collinearity levels. Particularly at higher levels of collinearity MM is the best with (LTS,LMS, and S) the succeeding set of best, whereas OLS and M perfomance is very poor and M-Estimation give the worse result.
Case V: ~ 0.6N (0, 1) + 0.4N (10,1): In this case 40% outliers all together with lower to moderate collinearity levels OLS and M method results in higher values of (TMSE), MM the next method giving subsEquent higher values of (TMSE), the other methods (LTS, LMS and S) havinglow and nearly similar values. However 40% outliers considered with a high multi-collinearity level, ranks the methods quite differently. The MM method which is unsurpassed one in all cases turn out to worse among all, M and OLS forming the second set resulting in higher (TMSE) values, and the other three methods (LTS, LMS and S) generating relatively small values of (TMSE).