Performance Analysis of Multi-level HAAR in Background Removal for Object Detection

Objective: This study proposes an improvement in the speed of object detection on multi-level HAAR processed images. Method: A background subtraction algorithm is implemented using phase as a feature to reduce the effect of illumination variation. The algorithm is implemented on level 2 and level 3 HAAR compressed images. Simulation results are obtained on the Kitware database. Findings: Simulation results show that object detection is faster in level 3 HAAR compressed images than in level 2 and level 1 HAAR compressed images. The average time required for processing a single frame is in the range of 6.53 to 29.22 ms at level 3, while at level 2 it is 6.65 to 36.46 ms. Improvement: Using this approach, a saving of 5% to 22% of processing time is observed at level 2 of HAAR, while a saving of 9% to 48% is observed at level 3 of HAAR.


Introduction
In the past, numerous approaches have been used to detect objects in compressed videos, driven by the increasing demand for video processing algorithms implemented on FPGA hardware 1. This study proposes object detection on HAAR compressed videos, taking into consideration the memory and power requirements of the implementation. Many features are used for object detection, such as histogram, edge and corner. Statistical features include intensity, SVM, texture, etc.
One of the intricate tasks in moving object detection is the detection of objects in the presence of changing illumination conditions 2,3. The approaches proposed for object detection include feature based object detection, template based object detection, etc. Recently, many feature extraction methods have been developed to overcome challenges such as changing illumination, occlusion and camouflage faced by background subtraction methods 4. The proposed approach uses phase as a feature because it is invariant to illumination changes. A Gabor filter is used to extract the phase, as its phase response is related to illumination invariance. Feature extraction is performed on multi-level HAAR compressed images, since the HAAR wavelet is very effective in detecting the exact instants when a signal changes. HAAR wavelets are also easy to implement and fast to compute. Results are generated for level 1, level 2 and level 3 HAAR compressed images. The speed of operation is compared for three sample files over 45 processed frames, showing that using higher HAAR levels in pre-processing increases the speed.
The paper is organized as follows: related work is surveyed in Section 2, and Section 3 briefly describes the methodology used. Experimental findings and observations are discussed in Section 4. The work is concluded in Section 5, and Section 6 discusses future work.

Literature Survey
The study "Background subtraction based on phase feature and distance transform" suggested detection of moving objects in changing illumination conditions. In 5,6, an algorithm is proposed in "Moving object detection in wavelet compressed video" for moving object detection in video compressed using the wavelet transform and the frame difference method. In 7, the study "A Proposed Approach for Image Compression based on Wavelet Transform and Neural Network" proposes training a back propagation neural network to select a suitable wavelet function so that the image is compressed efficiently. In [...] but the proposed work did not produce a very high recognition [...]. Different object tracking methods under different illumination conditions are reviewed in the review paper "Illumination Condition Effect on Object Tracking". The study "2-D Object Recognition Approach using Wavelet Transform" used the wavelet transform for object detection suitable for greyscale images; however, it did not work well for color images. The study "Moving Object Detection under Various Illumination Conditions for PTZ Cameras" tried to overcome the problem in background modelling by adapting a texture based method, XCS-LBP, and a GMM for segmenting the foreground. The study 8, "Multilevel decomposition Discrete Wavelet Transform for hardware image compression architectures applications", proposes a Discrete Wavelet Transform (DWT) for image compression applications to eliminate redundant information from the images or video frames transmitted over the wireless channel. A new image compression algorithm using the HAAR wavelet transformation is proposed in "A New Image Compression Algorithm using HAAR Wavelet" 9, for which compression ratio, PSNR, threshold value and reconstructed normalization were calculated. Computational time and computational complexity are reduced in the fast HAAR wavelet transform.
A modified HAAR for image compression is suggested in 10, "Image Compression using HAAR and Modified HAAR Wavelet Transform". The advantages of the HAAR and modified HAAR wavelet transforms are the sparse representation, fast transform and low memory space requirements 11.

Methodology Used
The work proposed in this study focuses on object detection. It is implemented on enhanced images so that object detection shows a marked reduction in execution time. The proposed work builds on a reference method in which object detection is performed in a complex environment. The phase feature approach of 12,13, which is suitable for background modelling, is used. Every pixel is modelled as a group of adaptive phase features. A distance transform is used to get clear results by aggregating the neighboring foreground pixels. The basic methodology consists of the following steps:
• Construct the Gabor filter. A 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave. φ is the orientation index and varies from 0 to φ_max - 1, υ is the frequency index and varies from 0 to υ_max - 1, and σ is the ratio of the Gaussian width to the wavelength. The filter response is

$$G_{\varphi,\upsilon}(z) = I(z) * \psi_{\varphi,\upsilon}(z)$$

where I(z) is the input frame, ψ_{φ,υ}(z) is the Gabor kernel, * is the convolution operator and G(z) is the outcome of the convolution, expressed as an amplitude part and a phase part. The Gabor filter is constructed with 4 frequencies and 6 orientations, which means υ_max = 4 and φ_max = 6; υ_max and φ_max respectively denote the number of frequencies and orientations. The frequency of each Gabor wavelet is determined by its frequency index υ.
• For every new pixel, the feature U is computed and compared with the present set of k models to check for matching models. The matching condition is that the feature value lies within the standard deviation of the distribution.
• When the new phase feature and the mean value of the background model both fall into this singular area, the distance between them is computed as 2π - Dist, where Dist is the absolute difference between the two values.
• If such models exist, the first matching one is selected for updating its parameters, and the model weights are updated as:

$$\omega_{k,t} = (1 - \alpha_1)\,\omega_{k,t-1} + \alpha_1 M_k$$

where ω_{k,t} is the weight of model k at frame t, α_1 ∈ (0,1) is the first learning rate, and M_k = 1 for the first matching Gaussian and zero for the others 4.
• Blob aggregation is performed using the distance transform and morphological operations 1.
• To reduce the processing time of the frames, the HAAR transform is applied at multiple levels (1, 2 and 3). Each frame is transformed using the HAAR operator before applying the Gabor filter based phase computation described above.
• The HAAR transform reduces the area of the frame to one fourth at every level, thereby permitting a very easy mapping of the threshold for background foreground discrimination of the objects.
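The model matching and weight update steps listed above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. The function names, the default learning rate, and the behaviour when no model matches (weights left unchanged) are assumptions made for illustration. The sketch also swaps in the generic circular distance min(Dist, 2π - Dist) for phase values in [0, 2π), because the singular-area test is not fully specified in the text.

```python
import math

TWO_PI = 2.0 * math.pi


def phase_distance(a, b):
    """Circular distance between two phase values in [0, 2*pi).

    Assumption: uses min(Dist, 2*pi - Dist) instead of an explicit
    singular-area test.
    """
    dist = abs(a - b)
    return min(dist, TWO_PI - dist)


def update_weights(phase, means, stds, weights, alpha=0.05):
    """Match a new phase feature against k models and update the weights.

    A model matches when the phase lies within one standard deviation of
    its mean. Only the first matching model gets M_k = 1; the rest get 0.
    Returns (index of the first matching model or None, new weights).
    """
    matched = None
    for k, (mu, sd) in enumerate(zip(means, stds)):
        if phase_distance(phase, mu) <= sd:
            matched = k
            break
    if matched is None:
        # Assumption: the text does not say what happens without a match,
        # so the weights are returned unchanged.
        return None, list(weights)
    new_weights = [
        (1.0 - alpha) * w + alpha * (1.0 if k == matched else 0.0)
        for k, w in enumerate(weights)
    ]
    return matched, new_weights
```

If the weights sum to one before the update, they still sum to one afterwards, since (1 - α) + α = 1.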

Experimental Results
The entire process is implemented in MATLAB running on a 64-bit Windows 10 system. The code records the processing time of each frame at multiple HAAR transform levels using the tic and toc commands of MATLAB on a frame by frame basis. The process is designed on the intuition that, since the HAAR transform compresses the low spatial frequency information into the top left quadrant of the frame and leaves only high spatial frequency information in the other three quadrants, the top left image provides a compressed frame with similar information to the original one. As the level of the HAAR transform is increased, the frame area is reduced to one fourth at each level, reducing the pixel count to be processed by the phase calculation module and then by the actual foreground background identification. This leads to a reduced processing time if a higher level of HAAR is used before the phase processing.
The per-frame timings show some jumps, which result from memory allocation by the operating system in response to the demands of all running programs and background processes. The initial frames are more susceptible to this kind of noise, and hence the values after the first 80 frames are used for the comparative study. Starting from frame 80, a span of 45 frames up to frame 125 was used for multiple files in the study.
The graph is now less noisy than the previous ones and clearly indicates that, even though the process itself suffers interference from the OS and other programs, the higher level still provides a better processing time per frame. Table 1 shows the timings obtained for the three files used for evaluation of the scheme.
The average time comparison is shown in Table 2. It is observed that as the level is increased, the average processing time per frame reduces. Figure 1 presents the comparative graphs for the average time. It is interesting to note that there is a very wide variation in the average time across the files, but as the HAAR level is increased there is a reduction. This indicates that the video contents dominate the process in terms of actual time (Figures 2 to 6).
The time saved is presented in Table 3. The outputs for various cases are presented below for a sample video.
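The quadrant argument can be made concrete with a short Python sketch that keeps only the low-frequency (top left) quadrant at each HAAR level. It is not the MATLAB implementation used in the study. The function names are hypothetical, frames are assumed to be lists of rows with even dimensions at every level, and the orthonormal scaling (a + b + c + d) / 2 per 2x2 block is an assumed normalization.

```python
def haar_approx(frame):
    """One level of the 2D Haar transform, low-frequency quadrant only.

    Each output value is the orthonormal Haar approximation coefficient
    of one 2x2 block of the input: (a + b + c + d) / 2.
    """
    rows, cols = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * r][2 * c] + frame[2 * r][2 * c + 1]
             + frame[2 * r + 1][2 * c] + frame[2 * r + 1][2 * c + 1]) / 2.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def haar_levels(frame, levels):
    """Apply haar_approx repeatedly; each level quarters the pixel count."""
    for _ in range(levels):
        frame = haar_approx(frame)
    return frame
```

For an 8x8 frame, levels 1, 2 and 3 give 4x4, 2x2 and 1x1 outputs, so the later phase computation sees 1/4, 1/16 and 1/64 of the original pixels.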
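A rough Python analogue of this measurement procedure is sketched below, with time.perf_counter standing in for MATLAB's tic and toc. The function names are hypothetical, and so is the 0-based window convention (frames 80 to 124 inclusive, 45 frames), since the text does not say whether frame 125 itself is included.

```python
import time


def time_frames(frames, process):
    """Return the processing time of each frame in milliseconds."""
    times_ms = []
    for frame in frames:
        start = time.perf_counter()  # plays the role of tic
        process(frame)
        times_ms.append((time.perf_counter() - start) * 1000.0)  # toc
    return times_ms


def window_mean(times_ms, start=80, span=45):
    """Average the timings of frames start .. start + span - 1.

    Skipping the first `start` frames discards the noisy warm-up period.
    Assumption: 0-based indexing with an exclusive upper bound.
    """
    window = times_ms[start:start + span]
    if not window:
        raise ValueError("not enough frames for the requested window")
    return sum(window) / len(window)
```

Calling window_mean on the timings of each HAAR level gives one average per level and file, which is the kind of figure compared across levels.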
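A percentage time saving of this kind can be computed as in the sketch below. The choice of baseline is an assumption, since the text does not state which timing the savings are measured against, and the numbers in the usage line are hypothetical rather than values from Table 3.

```python
def percent_saving(baseline_ms, level_ms):
    """Percentage of processing time saved relative to a baseline.

    Assumption: the baseline is the average per-frame time of the
    reference configuration, which the text does not identify.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline time must be positive")
    return (baseline_ms - level_ms) / baseline_ms * 100.0


# Hypothetical numbers: 40 ms baseline, 30 ms at a higher HAAR level.
saving = percent_saving(40.0, 30.0)
```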

Conclusion
Although the processing times of the various videos are not uniform, it is observed that a HAAR pre-processing step applied before the phase based foreground background separation helps in reducing the processing time. A saving of 5% to 22% of processing time is observed at level 2 of HAAR, while a saving of 9% to 48% is observed at level 3 of HAAR. This enhances the speed of object detection in changing illumination conditions.

Future Work
Apart from involving other transforms in the spatial domain, such as wavelets, the work can be expanded by using HOG features as a pre-processing step prior to the phase based foreground background detection. The design can be targeted to any FPGA device to build a reconfigurable processor.