An Efficient Velocity Estimation Approach for Face Liveness Detection using Sparse Optical Flow Technique

Objectives: To propose a new liveness detection algorithm that uses optical flow to distinguish an actual live face from a photograph or 2D mask in face recognition biometric security systems. Methods: This work proposes an anti-spoofing model, the Sparse Optical Flow Technique with Velocity Estimation Approach (SOFT_VEA). Optical flow is an effective method for tracking objects in motion; it is adapted in this work to capture facial movements and decide the liveness state. The proposed algorithm considers real faces and two kinds of photo imposters. Findings: From the input video, the motion information of specific facial landmark points is captured by an optical flow algorithm. The velocity of those landmark points is then estimated via Euclidean distance. Based on this calculated velocity, the fake face is discriminated from the real face using a threshold value. The empirical study shows that the proposed face liveness detection model is effective, with an accuracy of 88% and a Half Total Error Rate (HTER) of 2.45. Novelty: The proposed work covers real faces and photo imposters. The liveness detection algorithm is developed with a novel velocity estimation approach, which is useful for biometric security systems.


Introduction
Biometric systems for security have developed rapidly in recent years. Every biometric system has to uniquely identify an individual based on their physiological or behavioural features. Compared with biometric systems such as fingerprint, iris, and voice recognition, face recognition is more convenient and effective for users.
Face recognition is a low-cost and widely used biometric system because it requires only simple hardware, such as an optical camera, and a computationally inexpensive algorithm (1). This makes face recognition well suited to embedded and mobile devices. However, a face recognition system cannot by itself differentiate whether the presented face is a 'live' face or a 'not live' face, so it needs 'liveness detection' to protect the system against spoofing. Nowadays it is very easy to spoof a person's face using a photograph or 2D mask (2). Various methods have been implemented to find spoofed faces using life signs, textures, motions, and so on. This work focuses on a novel algorithm that detects facial movements with landmark points and a proposed velocity calculation.
Liveness detection for face recognition can be performed based on facial movements (3). The input is given in video format, i.e. a sequence of images, in which the system can identify the face regions and recognize movements such as head, eye, and lip movements in order to find fake or spoofed faces. The optical flow field is one of the trajectory algorithms used to track objects in motion (4). It has two approaches, dense and sparse. If all the points of the target are considered for tracking, the method is known as dense optical flow. When a selected number of key points is used to track the target object, it is known as sparse optical flow. Shin et al. proposed a non-prior training active feature model (NPT-AFM) for tracking objects or persons by extracting feature points and restoring motion information lost to occlusion (5).
The authors of (6) proposed the optical flow of lines to evaluate the motion of the human face for liveness detection. They proposed a quick method for face center detection to classify typical sources of error such as glasses and facial hair, achieving an error rate of 1%. The authors of (7) introduced optical flow to find the motion vectors used for recognizing emotions, using the optical flow algorithm proposed by Gautama and Van Hulle to find efficient motion vectors. The authors of (8) also proposed the optical flow of lines to evaluate the motion of the human face for liveness detection. Their system utilizes a model-based local Gabor decomposition and SVM experts to select the points that form the lines, and they found that the motion depth of a 3D face structure differs from that of 2D face masks. The authors of (9) experimented with an optical flow algorithm to differentiate 3D faces from 2D face masks, using optical flow field calculations to distinguish the faces with respect to threshold values for the degree of change. The authors of (10) proposed an optical flow algorithm for debris position and motion detection in image sequences; the parameters used in the method allow detection at a lower computational cost than previous methods. The authors of (11) used a challenge and response method for real-time liveness detection in security systems. The authors of (12) proposed a combined eye and mouth movement technique to obtain maximum reliability during face liveness detection. The authors of (13) proposed a hierarchical neural network that fuses image quality cues and motion cues for liveness detection. The authors of (14) proposed an efficient algorithm for movement estimation and object tracking in video scenes using an Optical Flow and Gabor Features Based Contour Model. The authors of (15) presented a video processing system that tracks gauzes in video captured by an endoscope; the system follows variations in the size of the gauzes under challenging illumination and against darker backgrounds.
In this work, a sparse optical flow-based liveness detection model is proposed. The proposed method analyses the difference in characteristics between real faces and photographs or 2D masks by detecting facial landmarks for every person and calculating the velocity of their optical flow field using an efficient velocity estimation approach.

Proposed Work
The proposed face liveness detection model using SOFT_VEA consists of three phases. In the first phase, facial landmark points are marked on the input video frames. Next, the points of interest, such as the eye, nose, and mouth points, are tracked using the optical flow algorithm with a sparse approach to find the motion between consecutive frames. Finally, an efficient velocity estimation approach is used to find the impact of sudden and gradual changes in the velocity of the landmark points. This impact is measured and, using a threshold factor, the fake faces are distinguished from the real ones. The proposed block diagram is shown in Figure 1.

Facial Landmark points
For the trajectory operation, the video input is converted to a sequence of images and 68 facial landmarks are detected in every frame. To find the facial landmarks, the system first needs to locate the human face in the input and enclose it in a bounding rectangle. Using the rectangle coordinates, the face landmark points are detected. Figure 2 shows the facial landmark points. Here, facial points 1 to 68 are detected, including key features such as the nose, eyes, eyebrows, and mouth.
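As an illustration of how the points of interest can be picked out of the 68 landmarks, the sketch below assumes the widely used iBUG 300-W 68-point ordering with 0-based indices. The text does not name the landmark detector, so the index ranges and the names `LANDMARK_REGIONS` and `select_points_of_interest` are assumptions made for illustration only. Under this assumed layout, the eye, nose, and mouth regions together contain 41 points.

```python
# Assumed 0-based index ranges of the iBUG 300-W 68-point landmark layout.
# These ranges are an assumption; the paper does not state the ordering used.
LANDMARK_REGIONS = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),
}


def select_points_of_interest(landmarks,
                              regions=("right_eye", "left_eye", "nose", "mouth")):
    """Return the (x, y) landmarks that belong to the requested regions."""
    if len(landmarks) != 68:
        raise ValueError("expected 68 landmark points")
    indices = sorted(i for name in regions for i in LANDMARK_REGIONS[name])
    return [landmarks[i] for i in indices]


if __name__ == "__main__":
    # Dummy landmarks stand in for the output of a real detector.
    dummy = [(float(i), float(i)) for i in range(68)]
    print(len(select_points_of_interest(dummy)))
```

Keeping the region table separate from the selection function makes it easy to experiment with other subsets, for example the eyes alone.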

Sparse Optical Flow Technique
Optical flow uses the change over time and the correlation of pixel intensities in an image sequence to determine the 'movement' at each pixel location; that is, it studies the relationship between the change in intensity over time and the structure and movement of objects in the scene. Optical flow can be computed in two ways: sparse optical flow and dense optical flow. Sparse optical flow selects a set of feature points or pixels, such as edges and corners, and tracks their motion, while dense optical flow tracks the motion of every pixel in each frame.
Optical flow for the face is irregular because of head motion and irregular facial expressions. The relative motion between the camera and the object is of four types: rotation, translation, moving forward and backward, and swing. All other kinds of motion are combinations of these four. The representation of each motion varies with the action performed. The optical flow for rotation, translation, and moving forward and backward is similar for both real faces and 2D masks, but the optical flow for swing (i.e. shaking, lowering, and raising the head) is different for real and 2D faces. Figure 3 shows the optical flow for swing motion.
A Taylor series is used to expand the right-hand side of equation (1). After eliminating the common terms, the optical flow equation is obtained, in which a and b are the unknown variables. One equation with two unknown variables cannot be solved directly, so the Lucas-Kanade method is introduced to solve this problem.
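For reference, this derivation is usually written as follows. The block gives the standard textbook form of the brightness constancy assumption and its first-order Taylor expansion; it is not a transcription of the authors' equations. The symbols I_x, I_y, and I_t for the partial derivatives of the image intensity I are assumed notation, while a and b follow the text.

```latex
% Brightness constancy: a pixel keeps its intensity while moving by (dx, dy) in time dt
I(x, y, t) = I(x + dx,\; y + dy,\; t + dt)

% First-order Taylor expansion of the right-hand side
I(x + dx,\; y + dy,\; t + dt) \approx I(x, y, t) + I_x\,dx + I_y\,dy + I_t\,dt

% Cancel I(x, y, t) on both sides and divide by dt
I_x\,a + I_y\,b + I_t = 0, \qquad a = \frac{dx}{dt}, \quad b = \frac{dy}{dt}
```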

Lucas-Kanade Algorithm:
The Lucas-Kanade method is a sparse optical flow technique that normally uses a corner detection algorithm to select corner pixels. In this work, instead of corner pixels, the landmark values are given to the Lucas-Kanade algorithm to find the flow of the human facial landmarks. The selected landmark values are then passed to the optical flow functions. It is assumed that all the pixels neighbouring a landmark have similar motion. Suppose the LK method takes a 3x3 patch from an image, which contains 9 pixel values; 9 equations then need to be solved to determine the two unknown variables a and b.
To solve these 9 equations, least-squares fitting needs to be applied. After applying the least-squares fitting algorithm, the system reduces to two equations with two unknowns.
The resulting equation gives the movement in x and y over time t. Finally, the unknown variables a and b are calculated from it. The resultant values from the Lucas-Kanade algorithm are used to calculate the optical flow using equation 1.
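The least-squares step can be sketched in a few lines. The example below is a minimal illustration that assumes the spatial gradients `ix`, `iy` and the temporal gradient `it` have already been sampled at the 9 pixels of a 3x3 patch. The function name `solve_patch_flow` and the sample gradient values are hypothetical. It forms the 2x2 normal equations of the over-determined system and solves them for a and b.

```python
def solve_patch_flow(ix, iy, it):
    """Least-squares solution (a, b) of ix[k]*a + iy[k]*b + it[k] = 0 over a patch."""
    if not (len(ix) == len(iy) == len(it)):
        raise ValueError("gradient lists must have the same length")

    # Entries of the 2x2 normal-equation matrix and right-hand side.
    sxx = sum(gx * gx for gx in ix)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    syy = sum(gy * gy for gy in iy)
    sxt = sum(gx * gt for gx, gt in zip(ix, it))
    syt = sum(gy * gt for gy, gt in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("patch has too little texture to solve for the flow")

    # Solve [[sxx, sxy], [sxy, syy]] [a, b]^T = [-sxt, -syt]^T by Cramer's rule.
    a = (-sxt * syy + sxy * syt) / det
    b = (-syt * sxx + sxy * sxt) / det
    return a, b


if __name__ == "__main__":
    # Hypothetical gradients for a 3x3 patch, generated from a known flow.
    ix = [1.0, 2.0, 0.0, 1.0, 3.0, 2.0, 0.0, 1.0, 2.0]
    iy = [0.0, 1.0, 2.0, 1.0, 0.0, 2.0, 3.0, 1.0, 1.0]
    true_a, true_b = 0.5, -0.25
    it = [-(gx * true_a + gy * true_b) for gx, gy in zip(ix, iy)]
    print(solve_patch_flow(ix, iy, it))
```

The determinant check reflects a known limitation of the method: in a flat or single-edge patch the two unknowns cannot be separated, which is one reason for tracking well-textured points such as corners or facial landmarks.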

Velocity Estimation Approach
Tracking the image sequence gives the velocity. Velocity measures the distance traveled per unit change in time, so for the input video, the motion of the human face is calculated from the first frame to the last frame using sparse landmark points selected from the eye, nose, and mouth regions. Euclidean distance is used to measure the distance traveled by the human face from the first to the last frame. These specific points are useful for finding 2D imposter faces and fake photo faces effectively. Thus, the velocity of the eye, nose, and mouth regions across consecutive frames is calculated using Euclidean distance as shown in equation 6.

$$\text{velocity} = \frac{\text{Euclidean distance}}{\text{total no. of frames}}$$
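A minimal sketch of this calculation is given below. It assumes that the tracker returns, for every frame, the (x, y) positions of the selected landmarks, and it takes the per-frame value to be the mean Euclidean displacement of those landmarks between consecutive frames. The names `frame_displacements` and `overall_velocity`, the data layout, and the sample coordinates are hypothetical.

```python
import math


def frame_displacements(tracks):
    """Mean Euclidean displacement of the landmarks between consecutive frames.

    tracks[f][k] is the (x, y) position of landmark k in frame f.
    """
    displacements = []
    for prev, curr in zip(tracks, tracks[1:]):
        if len(prev) != len(curr):
            raise ValueError("every frame must contain the same landmarks")
        dists = [math.hypot(x1 - x0, y1 - y0)
                 for (x0, y0), (x1, y1) in zip(prev, curr)]
        displacements.append(sum(dists) / len(dists))
    return displacements


def overall_velocity(tracks):
    """Total distance travelled divided by the total number of frames."""
    return sum(frame_displacements(tracks)) / len(tracks)


if __name__ == "__main__":
    # Two hypothetical landmarks tracked over three frames.
    tracks = [
        [(0.0, 0.0), (1.0, 1.0)],
        [(3.0, 4.0), (4.0, 5.0)],
        [(3.0, 4.0), (4.0, 5.0)],
    ]
    print(frame_displacements(tracks))
    print(overall_velocity(tracks))
```

With a fixed frame rate, the displacement per frame interval is proportional to velocity, so the list returned by `frame_displacements` can be plotted directly against the frame index.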
The average velocity of the selected landmark points in each frame is calculated and plotted as a graph. This graph shows the sudden and gradual changes that occur due to the 'shaking' action. For every video sample, the Critical Velocity (CV) value is calculated using the proposed formula as follows,
$$\text{mean} = \frac{\text{max. velocity}}{2} \quad (8)$$
After finding the CV value, the number of peaks in the graph above the CV value is counted to distinguish real faces from masked faces. If the number of peaks is greater than the threshold (th), the input video is a real face video. If the number of peaks is less than or equal to the threshold, the input video is a masked face video. Here, the threshold is set to 2.
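The peak-counting rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the CV is taken to be half of the maximum velocity in the video, a peak is taken to be a strict local maximum of the per-frame velocity series, and the decision follows the rule as stated in this section. The function names and the sample velocity series are hypothetical.

```python
def critical_velocity(velocities):
    """Assumed CV: half of the maximum per-frame velocity in the video."""
    return max(velocities) / 2.0


def count_peaks_above(velocities, level):
    """Count strict local maxima of the series whose value exceeds `level`."""
    peaks = 0
    for i in range(1, len(velocities) - 1):
        v = velocities[i]
        if v > level and v > velocities[i - 1] and v > velocities[i + 1]:
            peaks += 1
    return peaks


def classify(velocities, th=2):
    """Label a video 'real' if it has more than `th` peaks above the CV."""
    cv = critical_velocity(velocities)
    return "real" if count_peaks_above(velocities, cv) > th else "fake"


if __name__ == "__main__":
    # Hypothetical per-frame velocity series.
    many_peaks = [0.1, 0.9, 0.2, 1.0, 0.1, 0.8, 0.2]
    one_peak = [0.1, 0.2, 1.0, 0.2, 0.1, 0.3, 0.1]
    print(classify(many_peaks), classify(one_peak))
```

Because the CV is derived from each video's own maximum, the rule depends on the shape of the velocity curve rather than on its absolute scale.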

Implementation
The proposed work has been implemented in Python 3.6 with the Anaconda distribution. The OpenCV library (version 4.2.0) is used for handling video and image files.

Experimental Results
The experiment is conducted with 60 sample videos in three cases. Case 1 is the real faces of 20 different persons. Case 2 is 20 videos of persons with photo imposters or 2D masks. Case 3 is 20 videos of persons using photo imposters or 2D masks with eye and mouth openings. Figure 4 shows the sample databases of photo imposters, photo imposters with eye and mouth openings, and real face videos used in this work for tracking optical flow. For this work, only 41 of the 68 landmark points are selected, i.e. the eye, nose, and mouth landmark points. In Figure 5, the output screenshots show the proposed optical flow algorithm tracking the facial movements using landmark points on the sample database for all three categories. The velocity of the optical flow, calculated using equation 5, is represented for each video as a graph with frames on the x-axis and velocity on the y-axis.
For real human faces, micro facial movements produce low intensity values. For 2D masked faces, however, the facial movements are made deliberately by the person, which causes high intensity values. Velocity increases with intensity, so the velocity graph is used to differentiate real faces from 2D masked or fake faces. Figure 6 shows the optical flow graphs corresponding to the velocity calculated for each frame in the sample videos. If the peaks above the CV are greater than the assigned th value of 2, the video is categorised as a fake face video. From Table 1, it is inferred that, for the real face input videos, the proposed SOFT_VEA optical flow algorithm classifies 6 as real faces and 1 as a fake face out of 7 real face videos, with an accuracy of 88%. In Table 2, the experimental results for 7 input videos with photo imposters show that the proposed algorithm correctly classifies 5 masked faces as fake faces and classifies 2 masked faces as real faces, so the accuracy for this photo imposter input category is 71%. From Table 3, it is inferred that, for inputs of photo imposters with eye and mouth openings, the proposed optical flow algorithm classifies 6 out of 7 masked faces correctly, with an accuracy of 88%.
Half Total Error Rate (HTER) is used to evaluate the detection performances, which is a combination of False Acceptance Rate (FAR) and False Rejection Rate (FRR).
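For illustration, HTER is conventionally defined as the average of the two error rates, HTER = (FAR + FRR) / 2. The sketch below computes it from raw counts. The function names and the counts in the usage example are hypothetical and are not taken from the experiments reported here.

```python
def false_acceptance_rate(accepted_attacks, total_attacks):
    """Fraction of spoof attempts that were wrongly accepted as real."""
    return accepted_attacks / total_attacks


def false_rejection_rate(rejected_genuine, total_genuine):
    """Fraction of genuine attempts that were wrongly rejected as fake."""
    return rejected_genuine / total_genuine


def half_total_error_rate(far, frr):
    """HTER as the average of FAR and FRR."""
    return (far + frr) / 2.0


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    far = false_acceptance_rate(accepted_attacks=10, total_attacks=200)
    frr = false_rejection_rate(rejected_genuine=4, total_genuine=100)
    print(far, frr, half_total_error_rate(far, frr))
```

Averaging the two rates gives equal weight to accepting an imposter and rejecting a genuine user, regardless of how many samples of each kind are in the test set.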
From the above equation, the HTER value for the proposed optical flow algorithm using facial landmark detection is calculated as 2.45. DLTP (Dynamic Local Ternary Pattern) is used to analyse skin textures in order to find masked faces among real faces (16). The optical flow field methods use the degree of change in facial movements obtained with a traditional optical flow algorithm (9). In this work, instead of the corner detection used in a traditional optical flow algorithm, facial landmark points are used and their critical velocity is calculated using Euclidean distance. From Table 4, it is inferred that the proposed SOFT_VEA algorithm has higher accuracy and a lower error rate of 2.45 than existing methods.

Conclusion
This work proposed the Sparse Optical Flow Method with the Lucas-Kanade algorithm using facial landmark detection, and distinguished real and fake faces using the Velocity Estimation Approach. The proposed model has been tested with three kinds of input videos: persons with real faces, photo imposters, and photo imposters with eye and mouth openings. The performance of the proposed work is evaluated using accuracy values. The proposed model gives lower accuracy for inputs of persons with photo imposters or 2D masks because these closely resemble real faces, whereas photo imposters with eye and mouth openings show variation in the trajectory values between the real and masked regions. From the experimental results, it is inferred that the proposed model gives a better accuracy of 88% for real faces and photo imposters with eye and mouth openings than other state-of-the-art optical flow methods, and a lower error rate of 2.45 than existing systems. Future work will continue to optimize the model to further enhance its security performance.