Role Of Pattern Characteristics In Cross Correlation Based Motion Estimation

Objectives: To establish a pattern tracking based motion estimation algorithm for a stereovision system and to investigate the effect of the threshold value (Thv), size and population of patterns on the pattern tracking value. Methods: The proposed motion estimation algorithm correlates sets of motion frames captured by two high speed cameras configured as a stereovision system. The correlation scheme is based on grayscale pattern tracking in the moving frames, and separate pattern development and correlation algorithms were developed. A spherical object was given small random displacements and the motion was captured using the stereovision system. The effectiveness of the algorithm is evaluated with the pattern tracking value, which should be close to 1 for a perfect pattern match. Findings: The pattern tracking values were 0.920 and 0.899 for the left and right cameras respectively for a threshold value Thv > 10 and a pattern size of 10 pixels with a high population (200 patterns). The pattern tracking value decreased as the pattern size increased, and it was comparatively higher when the pattern population was at its maximum (200 patterns). The pattern tracking value decreased again when the threshold value Thv > 15 was used. It was concluded that a pattern size of 10 pixels with a threshold value Thv > 10 is the more effective choice for motion estimation. The proposed algorithm was verified for the 3D displacements of a rectangular plate mounted on an XYZ translation stage. The pattern tracking values were 0.97, 0.95, 0.96 and 0.96, 0.94, 0.95 for the X, Y and Z displacements respectively. The correlation algorithm was also coupled with wavelet based data compression. Novelty/Applications: The proposed algorithm can be efficiently applied to both in-plane and out of plane motion estimation.
The algorithm can provide constructive outcomes for small motion prediction with proper selection of pattern size and threshold value, which were not explored in previous works. The proposed algorithm, when coupled with an image compression algorithm, produces results comparable with the uncompressed data set. The study can be extended to other pattern shapes, which are crucial for objects with complicated edges and surfaces. The cross-correlation scheme avoids errors due to pattern mismatch or overlap during correlation. It also avoids matching two distinct patterns that have a similar threshold value.


Introduction
Image processing is an effective and emerging tool for motion assessment and evaluation in applications such as traffic control, surveillance, precision agriculture, military and defense. Most motion estimation techniques involve computer vision based algorithms. These techniques are based on image registration, which stores the motion features and tracks them in subsequent images. Feature tracking (1) investigates image features in motion frames using geometric transformation equations (2). The transformation equations require a set of images that cumulatively represent the motion. The conventional technique for estimating object motion was introduced by Horn and Schunck (3), but discontinuities in the motion violate its assumptions (4,5). The Lucas Kanade based algorithm computes the spatiotemporal derivatives, but without pyramids it fails in areas of large motion (6). The original Lucas Kanade algorithm also suffers from a high computational cost, and differences in shading and light intensity make the motion difficult to analyze (7). One of the simplest ways of tracking object motion is block based motion estimation, but the technique requires robust estimators, and in many cases the calculated results are disturbed by the surroundings. Several other techniques are available, but they are complex (7) and require a series of steps to track the motion of the object. Most techniques reported in previous studies are effective when the object displacements are large, whereas the investigation of small scale motions with pixel level accuracy is still an area of ongoing research. In addition, digital images suffer from a variety of distortions and noise effects (8), which should be accounted for in the motion estimation algorithm.
When small scale motions are to be quantified, both image quality and image quantity play crucial roles in efficient motion estimation (9). Static motion can be estimated with a couple of images, whereas dynamic or frequent motion requires multiple images so that the motion information is not lost between images. Non-contact methods (10) involving high-speed cameras (11,12) coupled with fixed focal length lenses are preferred for static and dynamic motions. High-speed cameras offer motion data with pixel level resolution and enable the prediction of micro-scale motion, hence they are effective in both static and dynamic motion estimation. Further technological advancements in high-resolution and high-speed cameras enable the tracking of both two and three-dimensional (13) objects at variable orientations. The image acquisition speed of high-speed cameras is adjusted on the basis of the required motion characteristics. Studies have been reported in which a stereo-rig consisting of a two-camera system was employed for motion measurement (10,14). Selecting a high-speed imaging system alone is inadequate for accurate motion characterization unless the image quality is also considered. In the case of small scale motion, grayscale patterns or features are first extracted from the reference image and then correlated in the motion frames. The patterns should be produced from the entire field of view or from the area of motion interest.
Pattern based feature tracking algorithms have gained enormous attention in the last decade (7). Feature matching algorithms track small identical subsets, also known as patterns, through the series of motion sequences. These patterns are only a few pixels in size, hence the object under investigation should be in proper focus before image acquisition, since blurred feature areas can reduce the effectiveness of the matching algorithm. For out of plane motion, high depth of field lenses should be preferred with high-speed cameras, since motion information is difficult to retrieve with conventional lenses. Pattern characteristics are therefore essential when image processing based techniques are concerned, and they should be addressed with a precise motion estimation algorithm. Further advancement in image processing based techniques was reported with correlation algorithms. Image correlation has emerged as a powerful tool due to its accuracy and high precision (15)(16)(17). Pattern based image correlation identifies the location, in the motion frames, of patterns created from a reference image (18). The patterns are used to correlate the image sequences captured from multiple cameras. A two-camera system constitutes a stereovision configuration, which is sufficient for in-plane and out of plane motion estimation. The reference image is partitioned into patterns (19) of various sizes depending on the object under consideration.
The images are captured from different viewpoints depending on the object and scene. Although pattern correlation based motion estimation algorithms are well established, coarse pattern features can reduce their effectiveness (20). Pattern features such as size, population and grayscale intensity are the inputs of correlation algorithms and should be selected precisely. Apart from pattern features, brightness (21) also plays an important part in tracking (22). When a pattern (23) tracks its location in the nearby frames, its intensity may vary depending upon the lighting. Hence, effective measures are taken for brightness constancy (24,25). Figure 1 shows the error sources in pattern recognition based motion estimation for small and large subsets. The subset size and thresholding value play a crucial role in matching the subsets to their appropriate locations. Figure 1(a) shows a small subset with multiple matches and Figure 1(b) shows the match overlap of large subsets. To avoid the errors in (a) and (b), cross-correlation is performed, which measures the similarity between the pattern and the base image. An efficient cross-correlation algorithm can be implemented with optimized process parameters such as pattern size, grayscale intensity and pattern population. Hence, the aim of the present work is to study the effect of pattern size, pattern population and thresholding value on cross-correlation based motion estimation. This paper presents an image correlation based motion estimation algorithm. The algorithm (11) calls patterns developed from the base image, and these patterns track their location in consecutive frames. We configured a stereovision camera system for motion acquisition that can be used for both in-plane and out of plane motion estimation. The tracking is performed by estimating the pixel level information. MATLAB codes were developed for the complete algorithm.
The effectiveness of the proposed algorithm is determined by the pattern tracking value. Image correlation rests on the tracking of patterns, which are formed with a certain pattern size and thresholding value. Several transformations are involved in relating the world coordinate system to the sensor coordinate system (10,26). Several other methods are also available for tracking. Single pixel imaging is a sequential technique that requires the object to be static until the signals from all patterns have been collected. Monin et al. (27) modified the single pixel algorithm and studied moving object frames with piece-wise 2D linear motion; when the object moved within the image plane during data acquisition, the calculated root mean square error was 0.22 with no regularization. However, the analysis was limited to planar movement only. The technique leads to large reconstruction complexity with a non-cyclic basis, and the local motion algorithm is limited to a static background. The present study reports a high resolution motion measurement technique and its implementation for motion ranging from micro-scale through meso-scale to macro-scale. To compress the pixel level motion information in 3D, a compression algorithm is coupled with the correlation algorithm; the pattern tracking parameter is calculated for both compressed and uncompressed image frames, and the calculated root mean square error is 0.18. The wavelet transform has emerged as a powerful tool in multi-resolution analysis of images (28,29). Orthogonality results in complicated design equations (30), whereas symmetric filters are well suited to the representation of an image (31). This paper describes the tracking of an object in terms of its contour, the pattern region of interest and the location of those patterns in nearby frames, by determining the pattern tracking parameter. The study is extended to compressed image frames by coupling the image correlation algorithm with the compression algorithm.
The present study also includes the cumulative effect of various motion estimation parameters such as pattern size, pattern population and threshold intensity, which were not explored in previous works.

Cross-Correlation in Pattern Tracking
The image correlation based motion estimation algorithm relies on feature matching: it tracks the grayscale intensity of patterns through the image sequences. Grayscale images are preferred for correlation, and suitable grayscale content can be produced by applying speckle patterns to the object surface. The next step is motion acquisition in image sequences. In image correlation, three transformations are involved: world coordinate system to camera coordinate system, camera coordinate system to image coordinate system, and finally image coordinate system to sensor coordinate system (10). The algorithm for the proposed method involves system calibration (32) and correlation. To obtain the displacement of any point, the world coordinate system (x_w, y_w, z_w), the sensor coordinate system (x_p, y_p) and the camera coordinate system (x, y, Z) are related using equation (1) and equation (2) (10). [R] and [T] are the extrinsic parameters, representing rotation and translation respectively. These extrinsic parameters are obtained to determine the position and orientation of camera 2 relative to camera 1. They are calculated from calibration (33) and are needed to obtain the displacement of any point in space. Two cameras are employed, indexed by j = 1, 2. A checkerboard plate is preferred for calibration. A set of 20-30 image pairs with translation and rotation can provide good calibration data. The Camera Calibrator app in MATLAB is used for this purpose. The cameras are placed at the desired angle (14) such that the object is in focus. Calibration of the camera system should be performed with the same camera settings and orientation that are adopted during motion acquisition. To avoid correlation errors such as multiple matches of templates, cross-correlation is preferred.
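Equations (1) and (2) are not reproduced in this text. As a hedged sketch only, the standard pinhole stereo model relates the three coordinate systems as shown below. This form is an assumption based on the usual camera model, not necessarily the exact form used in (10). The symbols f_x, f_y (focal lengths in pixels) and (c_x, c_y) (principal point) are assumed names for the intrinsic parameters.

```latex
\begin{align*}
\begin{bmatrix} x \\ y \\ Z \end{bmatrix}_j
  &= [R]_j \begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix} + [T]_j,
  \qquad j = 1, 2, \\
x_p &= f_x \, \frac{x}{Z} + c_x, \qquad
y_p = f_y \, \frac{y}{Z} + c_y .
\end{align*}
```

Under this assumed model, the first relation carries the extrinsic parameters [R] and [T] of camera j, and the second carries the intrinsic parameters obtained from calibration.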
Cross-correlation based image correlation can estimate the motion of an object, whether it occurs naturally or on an applied test surface, by tracing the motion of templates. Any image can be read as a matrix in which each pixel has an intensity value. The pattern tracking value is calculated as a measure of the quality of the match; for an exact match it equals unity. For m observations of variables x and y, the correlation can be expressed as in equation (3), where x' and y' are the mean values.
If I(x, y) is the image intensity at location (x, y) and f(x, y) is the feature template, the similarity between them is calculated by cross-correlation as shown in equation (4), where x'' and y'' are the pixel displacements.
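Equations (3) and (4) are not reproduced in this text. The standard forms that match the surrounding definitions are sketched below as an assumption: the correlation coefficient r of m observations with means x' and y', and the cross-correlation c of the image I with the template f at pixel displacement (x'', y''). The symbols r and c are names introduced here for illustration.

```latex
\begin{align*}
r &= \frac{\sum_{i=1}^{m} (x_i - x')(y_i - y')}
          {\sqrt{\sum_{i=1}^{m} (x_i - x')^2}\;\sqrt{\sum_{i=1}^{m} (y_i - y')^2}}, \\
c(x'', y'') &= \sum_{x} \sum_{y} I(x, y)\, f(x - x'', \, y - y'') .
\end{align*}
```

With this form, r equals 1 when the template and the image region differ only by a positive gain and an offset in intensity, which is consistent with a pattern tracking value of unity for an exact match.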
The cross-correlation can be improved by optimizing the pattern characteristics. Hence, the present study is aimed at optimizing the pattern size and thresholding value, and the effect of these characteristics is studied using the pattern tracking value. Figure 3 represents the correlation algorithm developed in the present study. It includes setup calibration followed by image correlation. The calibration process produces the intrinsic and extrinsic parameters required by the correlation algorithm. We first configured the two-camera system. The angle between the cameras was set to 34.12° and was verified against the extrinsic parameters obtained after calibration. We recommend using the same camera settings as in the calibration process to avoid correlation errors due to a mismatch in intrinsic parameters such as focal length and principal point. The calibration grid is kept in the same region where the object under consideration is placed. The next step is the correlation process. Grayscale speckling makes the correlation process effective, since the algorithm tracks the grayscale intensity of templates in the motion frames. The motion of the object is captured using the camera system and the image sequences are obtained. We developed separate codes for pattern development and correlation. A reference image (preferably the first image of the left camera) is selected for pattern development. We created patterns of size 10 and 20 pixels in the present study. The set of motion images is imported into our correlation code written in the MATLAB environment, and the patterns and motion frames are called in the correlation script. The algorithm then correlates these patterns, with different thresholding values, in the motion frames and generates the pattern tracking value. To map the motion information into the image plane, the perspective projection parameters are needed. A set of 100 images is captured for each considered case.
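The pattern development and correlation steps described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB code. The function names, the exhaustive search, the use of the correlation coefficient as the tracking value, and the rule that a pattern is kept when its mean intensity exceeds Thv are all assumptions made for this sketch.

```python
import random
from math import sqrt


def extract_block(image, top, left, size):
    """Return the size x size block whose upper-left pixel is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def passes_threshold(block, thv):
    """ASSUMED rule: keep a pattern when its mean grayscale intensity exceeds thv."""
    values = [v for row in block for v in row]
    return sum(values) / len(values) > thv


def tracking_value(block_a, block_b):
    """Correlation coefficient of two equally sized blocks (1.0 = perfect match)."""
    xs = [v for row in block_a for v in row]
    ys = [v for row in block_b for v in row]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys))
    return num / den if den else 0.0  # a flat block carries no texture to match


def track_pattern(frame, pattern):
    """Search every position in frame; return (best tracking value, top, left)."""
    size = len(pattern)
    best = (-2.0, 0, 0)
    for top in range(len(frame) - size + 1):
        for left in range(len(frame[0]) - size + 1):
            score = tracking_value(pattern, extract_block(frame, top, left, size))
            if score > best[0]:
                best = (score, top, left)
    return best


# Synthetic demonstration (hypothetical data): a random speckle image that is
# shifted cyclically by 2 rows and 1 column to imitate a small displacement.
rng = random.Random(0)
height, width, size = 12, 12, 4
reference = [[rng.randint(11, 255) for _ in range(width)] for _ in range(height)]
frame = [[reference[(r - 2) % height][(c - 1) % width] for c in range(width)]
         for r in range(height)]

pattern = extract_block(reference, 3, 2, size)
if passes_threshold(pattern, thv=10):
    value, top, left = track_pattern(frame, pattern)
    # Tracking value followed by the (row, column) displacement of the pattern.
    print(value, top - 3, left - 2)
```

In this synthetic case the pattern should be recovered two rows down and one column to the right with a tracking value of essentially 1. Real frames would give lower values because of noise, lighting changes and perspective, which is what the pattern tracking value is meant to quantify.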
We observed the correlation results while varying the pattern population from 25 to 200 patterns. The pattern size is an essential factor, since the patterns carry the actual motion information. The experiments are performed for two different pattern sizes. The camera speed is selected based on the object motion and object focus. A thresholding value for tracking the pixel intensities is chosen for correlation. Figure 2 shows the motion of a pixel between the reference image and the deformed image. Thus, we report our algorithm together with the effects of pattern size, pattern density and thresholding value on motion estimation.

Camera Calibration
I-speed TR cameras (10000 fps) with a resolution of 1280 x 1024 pixels and Nikon 50 mm, 1:1.4D lenses are used in the present study. The calibration and correlation experiments are performed at the same camera speed of 1500 fps. The camera speed is motion specific, and its selection is important in two ways. First, it helps in synchronizing with the frequency of the moving object so that the required motion information is obtained. Second, a high camera speed helps in acquiring multiple frames even for a small displacement, which makes it relatively easy for the correlation to find the respective locations of the patterns in the motion frames. We captured 100 frames for each case with this setting. Camera input settings with sufficient illumination ensured the focus on the object. A checkerboard pattern grid was kept in the common field of view of both cameras. The size of the checkerboard grid was 12 mm. We gave random small translations and rotations to the grid for calibration. The idea was to verify the setup for both in-plane and out of plane measurements by obtaining the extrinsic parameters. For in-plane motion estimation alone, however, a single camera is sufficient and in-plane translation alone can produce the calibration parameters. The principal points obtained after calibration were (639.432, 511.221) and (638.245, 511.008) for camera-1 and camera-2 respectively, which are close to the ideal principal point (640, 512) and indicate the accuracy of the calibration. Table 1 shows the stereo camera calibration parameters for camera 1 and camera 2. The position and orientation of camera 2 relative to camera 1 are determined by its rotation and translation (10).
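The closeness of the calibrated principal points to the ideal one can be quantified with a short calculation. The sketch below uses the values reported above; the function name and the choice of Euclidean distance in pixels as the measure are assumptions made for illustration.

```python
from math import hypot

# Ideal principal point: the centre of the 1280 x 1024 sensor.
IDEAL = (640.0, 512.0)

# Principal points reported after calibration.
PRINCIPAL_POINTS = {
    "camera-1": (639.432, 511.221),
    "camera-2": (638.245, 511.008),
}


def principal_point_offset(point, ideal=IDEAL):
    """Return (dx, dy, distance) in pixels from the ideal principal point."""
    dx = point[0] - ideal[0]
    dy = point[1] - ideal[1]
    return dx, dy, hypot(dx, dy)


offsets = {name: principal_point_offset(p) for name, p in PRINCIPAL_POINTS.items()}
for name, (dx, dy, dist) in offsets.items():
    print(f"{name}: dx={dx:.3f} px, dy={dy:.3f} px, distance={dist:.3f} px")
```

The offsets come to roughly 0.96 pixels for camera-1 and 2.02 pixels for camera-2, both small relative to the 1280 x 1024 sensor.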

Validation of Algorithm for x, y and z Translation
Prior to the motion measurements, the proposed algorithm was validated for the 3D displacement of an XYZ stage traverse. We created a small rectangular plate with patterns. The translation is performed in the X, Y and Z directions with the help of a screw-driven XYZ translation stage with measurement markings. Figure 4 shows the translated plate with 5 patterns and the XYZ stage traverse used. The translation given to the plate was also captured using the camera system. The plate is translated by a known distance of 5 mm in three mutually perpendicular directions. The least count of the traverse used during calibration was 0.01 mm. The patterns created on the plate were extracted using the pattern making code, and all of them were supplied as inputs to the correlation algorithm. The threshold intensity was set to Thv > 10. We applied the correlation algorithm to the sequence of images obtained during displacement, and the locations of all five patterns were obtained from the algorithm. The results from the proposed algorithm were compared for each pattern. Tables 2, 3 and 4 show the translation in the X, Y and Z directions for the different patterns. To further observe the effectiveness of the correlation, we also obtained the pattern tracking value for every pattern. The pattern tracking value for the XYZ displacement was found to be in the range 0.94-0.97, which indicates the effectiveness of the correlation algorithm for small displacements.
Figure 5 represents a spherical 3D object considered for motion estimation. It is evident that the object was in proper focus in both cameras. A spray gun was used to apply a speckle pattern over the object surface. This is an iterative process and requires precision and control. The patterns were extracted from the left camera image. Using the traverse, we gave the object a small out of plane displacement that includes all three translations.
Figure 6 represents six motion images out of the 100 frames that constitute the complete motion. The left and right images are captured from the left and right cameras respectively. It is evident from these figures that the cameras were effectively aligned prior to image acquisition. We captured the motion using the camera system with the 100 frame setting. MATLAB code was developed for cropping the undesired object area, for pattern generation, and for tracking the patterns in the subsequent frames. Pattern sizes of 10 and 20 pixels are used in the pattern making process. We studied the correlation in terms of the pattern tracking value for pattern populations of 25 to 200. The correlation results were also analyzed for threshold intensities greater than 10 and greater than 15.

Coupling of Image Correlation Algorithm with Compression Technique
The technique is extended to compressed images to minimize the storage space occupied by the pixel level motion information. The image correlation algorithm is coupled with a compression technique using the wavelets bior 5.5 and bior 6.8. The image frames are compressed and the pattern tracking value is calculated for both compressed and uncompressed image frames. Figure 7 represents the compressed first image frame from the left and right cameras.
Hence, it could be concluded that correlation effectiveness can be improved by using a high population density of small patterns. To study the effect of threshold intensity, we further implemented the correlation algorithm with a threshold value (Thv) greater than 15. The tracking value again increased with pattern population. However, it decreased for both small and large patterns compared to the data presented for a threshold value (Thv) greater than 10. The increment in pattern tracking for this pattern size is 6.4% for 25 patterns, 9.5% for 50 patterns, 11.0% for 100 patterns, 12.2% for 150 patterns and 13.1% for 200 patterns for the left camera. Similarly, for the right camera it is 2.4% for 25 patterns, 4.6% for 50 patterns, 6.9% for 100 patterns, 8.2% for 150 patterns and 11.2% for 200 patterns. Table 6 shows the pattern tracking for thresholding of greater than 15. A few important points are noticed:
Pattern tracking value is higher when thresholding is greater than 10 than when it is greater than 15. a) This indicates that the patterns with Thv > 10 play a vital role in tracking, and hence b) patterns with the full grayscale range should be developed, which could be accomplished by uniform speckling. Comparing the pattern tracking values for 200 patterns, the percentage reduction in pattern tracking is 13.15% for a pattern size of 10 for the left camera and 13.35% for the right camera.
Pattern tracking value increases with pattern density. a) More patterns cover the maximum object tracking area. b) The pattern tracking values from the two cameras are different, but the percentage increment in pattern tracking is almost identical, which also verifies the algorithm. The difference in pattern tracking value between the cameras corresponds to the variation in intrinsic camera calibration parameters.
Pattern tracking value decreases with increase in pattern size. The pattern size is decided by the area of interest of the object, and an increase in pattern size corresponds to a drop in pattern tracking values. This could be due to a) mismatch of patterns caused by the same average intensity of patterns. b) Small patterns can also give multiple matches, which is why cross-correlation is used. The pattern tracking value saturates beyond 200 patterns.
Table 7 shows the pattern tracking value for uncompressed image frames and for image frames compressed with bior 6.8 and bior 5.5 at thresholding of > 10. The result shows that the image frames compressed with bior 6.8 predict motion information equivalent to that of the uncompressed image frames. The pattern tracking values for camera 1 and camera 2 with bior 6.8 compressed image frames are 0.901 and 0.852, which are close to the pixel level motion information of the uncompressed frames. Compression requires linear phase to avoid unexpected distortions in the compressed image, so wavelets with linear phase such as bior can be adopted at thresholding of > 10. Table 8 shows the comparison of root-mean-square error.
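A root-mean-square error of the kind compared in Table 8 can be computed as sketched below. The text does not state which quantities enter the error, so the sketch assumes two equal-length sequences of values, one from uncompressed frames and one from compressed frames. The function name and the sample numbers are hypothetical and are not results from this study.

```python
from math import sqrt


def rmse(reference, estimate):
    """Root-mean-square error between two equal-length sequences of values."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical grayscale values, for illustration only.
uncompressed = [10, 20, 30, 40]
compressed = [12, 18, 30, 44]
error = rmse(uncompressed, compressed)
print(f"RMSE = {error:.3f}")
```

An error of zero would mean that compression left the compared values unchanged; larger values indicate greater loss of motion information.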

Conclusion
The aim of the present study was to estimate the motion of a 3D object using a non-contact method. High speed cameras were used in the experiment. The cameras were calibrated, and the intrinsic and extrinsic parameters were used in the correlation process. MATLAB code was developed for pattern formation and pattern tracking. The pattern tracking was performed at different thresholding values by varying the pattern size and the number of patterns. It was concluded from the experiment that increasing the number of patterns increases the pattern tracking value, and that the best correlation is achieved with a larger number of patterns at a pattern size of 10 that covers all intensity values greater than 10. It was also concluded that the proposed algorithm can be applied for motion estimation of any object, irrespective of its shape. The pattern tracking value reaches 0.920 and 0.899 for the left and right cameras at a thresholding value of greater than 10 with a pattern size of 10. The pattern tracking value was found to increase with the number of patterns, depending upon the object, and it saturates at around 150 to 200 patterns for a pattern size of 10 and a thresholding value of greater than 10. Coupling the image correlation algorithm with the compression technique reduces the storage required for the image sequences. The pixel level motion information for the compressed images was compared with that of the uncompressed image frames. The pattern tracking values calculated for bior 6.8 compressed image frames are 0.901 and 0.852 for the left and right cameras, which is equivalent to the uncompressed frames. The proposed technique can be applied efficiently for motion tracking of sensitive objects where mounting a physical sensor is complicated. Square patterns are developed in the pattern making process. This can be extended to produce patterns of multiple shapes for correlation and to eliminate excess areas that are outside the motion of interest.