A Novel Zernike Moment-Based Real-Time Head Pose and Gaze Estimation Framework for Accuracy-Sensitive Applications
2. Related Work
2.1. Appearance and Template-Based Methods
2.2. Geometric and Model-Based Methods
2.3. Deep Learning-Based Methods
3.1. Preprocessing and Eye Pair Detection
3.2. Feature Extraction
3.2.1. Zernike Moments Feature Computation
3.2.2. Zernike Moments Calculation
- Take an entire image I(p,q) of size 256 × 256 as input.
- Determine the center of the image.
- Create a square window of minimum size (100 × 100 for gaze estimation, 64 × 64 for pose estimation) to focus on the head region without distortion.
- Place the square window in the original input image so that the image center coincides with the center of the window.
- Determine the order (the n, m values) satisfying the condition in Equation (5).
- Compute the ZM basis functions by generating the polynomials using Equations (4) and (6).
- Compute the complex Zernike moments of the image from the basis functions, as in Equation (7).
- Reconstruct the shape from the basis functions and the complex Zernike moments.
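The steps above can be sketched in NumPy. This is an illustrative implementation, not the authors' code: it maps the pixel grid onto the unit disk, builds the basis V_nm(ρ, θ) = R_nm(ρ)e^{jmθ}, and accumulates the complex moment as in Equations (4)–(7).

```python
import numpy as np
from math import factorial

def radial_poly(rho, n, m):
    """Radial Zernike polynomial R_nm(rho); requires |m| <= n and
    n - |m| even (the order condition of Equation (5))."""
    m = abs(m)
    R = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s)
                * factorial((n + m) // 2 - s)
                * factorial((n - m) // 2 - s)))
        R += c * rho ** (n - 2 * s)
    return R

def zernike_moment(img, n, m):
    """Complex Zernike moment Z_nm of a square grayscale image whose
    pixel grid is mapped onto the unit disk."""
    N = img.shape[0]
    y, x = np.mgrid[0:N, 0:N]
    x = (2 * x - N + 1) / (N - 1)      # map columns to [-1, 1]
    y = (2 * y - N + 1) / (N - 1)      # map rows to [-1, 1]
    rho = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0                # ZMs are defined on the unit disk only
    V_conj = radial_poly(rho, n, m) * np.exp(-1j * m * theta)
    return (n + 1) / np.pi * np.sum(img[inside] * V_conj[inside])
```

The magnitude |Z_nm| is invariant under in-plane rotation, which is the property that makes ZM features attractive as pose and gaze descriptors.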
3.3. Dimensionality Reduction
3.4. Head Pose and Gaze Direction Estimation
4. Results and Discussion
4.1. Data Collection
4.2. Performance Evaluation for Zernike Moments Feature Extraction
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
- Krinidis, M.; Nikolaidis, N.; Pitas, I. 3-D Head Pose Estimation in Monocular Video Sequences Using Deformable Surfaces and Radial Basis Functions. IEEE Trans. Circuits Syst. Video Technol. 2009, 19, 261–272.
- National Center for Statistics and Analysis. Distracted Driving 2019 (Research Note, Report No. DOT HS 813 111); National Highway Traffic Safety Administration: Washington, DC, USA, April 2021. Available online: https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/813111 (accessed on 12 August 2022).
- University of North Carolina Highway Safety Research Center. Available online: https://www.hsrc.unc.edu/news/announcements/hsrc-to-lead-ncdot-center-of-excellence/ (accessed on 12 August 2022).
- Lu, F.; Sugano, Y.; Okabe, T.; Sato, Y. Adaptive Linear Regression for Appearance-Based Gaze Estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 2033–2046.
- Zavan, F.H.; Nascimento, A.C.; Bellon, O.R.; Silva, L. Nose pose: A competitive, landmark-free methodology for head pose estimation in the wild. In Proceedings of the Conference on Graphics, Patterns and Images—W. Face Processing, São Paulo, Brazil, 4–7 October 2016.
- Hossain, M.A.; Assiri, B. An Enhanced Eye-Tracking Approach Using Pipeline Computation. Arab. J. Sci. Eng. 2020, 45, 3191–3204.
- Svanera, M.; Muhammad, U.; Leonardi, R.; Benini, S. Figaro, hair detection and segmentation in the wild. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 933–937.
- Wang, J.G.; Sung, E. Study on Eye Gaze Estimation. IEEE Trans. Syst. Man Cybern. 2002, 32, 332–350.
- Ranjan, R.; Patel, V.M.; Chellappa, R. HyperFace: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 121–135.
- Lee, H.; Iqbal, N.; Chang, W.; Lee, S.-Y. A Calibration Method for Eye Gaze Estimation System Based on 3D Geometrical Optics. IEEE Sens. J. 2013, 13, 3219–3225.
- Zhu, Y.; Xue, Z.; Li, C. Automatic Head Pose Estimation with Synchronized Sub Manifold Embedding and Random Regression Forests. Int. J. Signal Process. Image Process. Pattern Recognit. 2014, 7, 123–134.
- Murphy-Chutorian, E.; Trivedi, M.M. Head Pose Estimation in Computer Vision: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 31, 607–626.
- Fu, Y.; Huang, T.S. Graph Embedded Analysis for Head Pose Estimation. In Proceedings of the IEEE 7th International Conference on Automatic Face and Gesture Recognition (FGR'06), Southampton, UK, 10–12 April 2006; pp. 3–8.
- Khan, K.; Khan, R.U.; Leonardi, R.; Migliorati, P.; Benini, S. Head pose estimation: A survey of the last ten years. Signal Process. Image Commun. 2021, 99, 116479.
- Lu, D.; Weng, Q. A survey of image classification methods and techniques for improving classification performance. Int. J. Remote Sens. 2007, 28, 823–870.
- Vankayalapati, H.D.; Anne, K.R.; Sri Harsha, S. Estimating Driver Attentiveness Through Head Pose Using Hybrid Geometric-Based Method. Smart Intell. Comput. Appl. 2022, 1, 197–204.
- Lal, S.K.; Craig, A.; Boord, P.; Kirkup, L.; Nguyen, H. Development of an algorithm for an EEG-based driver fatigue countermeasure. J. Saf. Res. 2003, 34, 321–328.
- Li, C.; Tong, R.; Tang, M. Modelling Human Body Pose for Action Recognition Using Deep Neural Networks. Arab. J. Sci. Eng. 2018, 43, 7777–7788.
- Kerr-Gaffney, J.; Harrison, A.; Tchanturia, K. Eye-tracking research in eating disorders: A systematic review. Int. J. Eat. Disord. 2019, 52, 3–27.
- Wijesoma, W.S.; Kodagoda, K.R.S.; Balasuriya, A.P. Road boundary detection and tracking using ladar sensing. IEEE Trans. Robot. Autom. 2004, 20, 456–464.
- Zhang, X.; Sugano, Y.; Bulling, A. Evaluation of appearance-based methods and implications for gaze-based applications. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '19); Association for Computing Machinery: New York, NY, USA, 2019.
- Zhu, X.; Lei, Z.; Liu, X.; Shi, H.; Li, S.Z. Face alignment across large poses: A 3D solution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 146–155.
- Dinges, D.F.; Grace, R. PERCLOS: A Valid Psychophysiological Measure of Alertness as Assessed by Psychomotor Vigilance; Report No. FHWA-MCRT-98-006; Federal Motor Carrier Safety Administration: Washington, DC, USA, 1999.
- Hiraiwa, J.; Vargas, E.; Toral, S. An FPGA-Based Embedded Vision System for Real-Time Motion Segmentation. In Proceedings of the 17th International Conference on Systems, Signals and Image Processing, Rio de Janeiro, Brazil, 17–19 June 2010; pp. 360–363.
- Videla, L.S.; Rao, M.R.; Anand, D.; Vankayalapati, H.D.; Razia, S. Deformable facial fitting using active appearance model for emotion recognition. In Smart Intelligent Computing and Applications; Smart Innovation, Systems and Technologies; Springer: Singapore, 2019; Volume 104, pp. 135–144.
- Xu, Y.; Jung, C.; Chang, Y. Head pose estimation using deep neural networks and 3D point clouds. Pattern Recognit. 2022, 121, 108210.
- Yuan, H.; Li, M.; Hou, J.; Xiao, J. Single image-based head pose estimation with spherical parametrization and 3D morphing. Pattern Recognit. 2020, 103, 107316.
- Jiang, N.; Yu, W.; Tang, S.; Goto, S. Cascade Detector for Rapid Face Detection. In Proceedings of the IEEE 7th International Colloquium on Signal Processing and its Applications, Penang, Malaysia, 4–6 March 2011; pp. 155–158.
- Khotanzad, A.; Hong, Y.H. Invariant Image Recognition by Zernike Moments. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 489–498.
- Aunsri, N.; Rattarom, S. Novel eye-based features for head pose-free gaze estimation with web camera: New model and low-cost device. Ain Shams Eng. J. 2022, 13, 101731.
- Svensson, U. Blink Behaviour-Based Drowsiness Detection; Linköping University and Swedish National Road and Transport Research Institute: Linköping, Sweden, 2004.
- Barra, P.; Barra, S.; Bisogni, C.; De Marsico, M.; Nappi, M. Web-shaped model for head pose estimation: An approach for best exemplar selection. IEEE Trans. Image Process. 2020, 29, 5457–5468.
- Ji, Q.; Zhu, Z.; Lan, P. Real-Time Nonintrusive Monitoring and Prediction of Driver Fatigue. IEEE Trans. Veh. Technol. 2004, 53, 1052–1068.
- Vijayalakshmi, G.V.M.; Raj, A.N.J. Zernike Moments and Machine Learning Based Gender Classification Using Facial Images. In Proceedings of the Eighth International Conference on Soft Computing and Pattern Recognition, Vellore, India, 19–21 December 2016; Springer: Singapore, 2018; pp. 398–408.
- Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Kauai, HI, USA, 8–14 December 2001; Volume 1.
- Hasan, S.Y. Study of Zernike Moments Using Analytical Zernike Polynomials. Adv. Appl. Sci. Res. 2012, 3, 583–590.
- Amayeh, G.; Kasaei, S.; Bebis, G.; Tavakkoli, A.; Veropoulos, K. Improvement of Zernike Moment Descriptors on Affine Transformed Shapes. In Proceedings of the 9th International Symposium on Signal Processing and Its Applications (ISSPA), Sharjah, United Arab Emirates, 12–15 February 2007.
- Fagertun, J.; Stegmann, M.B. Free Datasets for Statistical Models of Shape; Informatics and Mathematical Modelling, Technical University of Denmark (DTU): Lyngby, Denmark, 2005. Available online: http://www.imm.dtu.dk/~aam/datasets/datasets.html (accessed on 13 February 2020).
- Weyrauch, B.; Huang, J.; Heisele, B.; Blanz, V. Component-Based Face Recognition with 3D Morphable Models. In Proceedings of the IEEE Workshop on Face Processing in Video, Washington, DC, USA, 27 June–2 July 2004.
- Gross, R.; Li, S.Z.; Jain, A.K. Face Databases. Available online: https://www.face-rec.org/databases/ (accessed on 13 February 2020).
- Nordstrom, M.M.; Larsen, M.; Sierakowski, J.; Stegmann, M.B. The IMM Face Database—An Annotated Dataset of 240 Face Images; Informatics and Mathematical Modelling, Technical University of Denmark: Lyngby, Denmark, 2004.
| Approach | Description |
| --- | --- |
| Head and facial movement analysis | Example 1: An infrared active sensor detects both pupil and head motions under variable lighting conditions. Facial features are tracked with a Kalman filter, which also smooths the feature motion; Gabor wavelets are used for fast feature detection. Example 2: The face and eyes are first detected with an AdaBoost classifier and then passed through a Gabor filter; the output is normalized and fed to a data-driven classifier, a support vector machine (SVM) [20,21]. Example 3: The human face and facial features are detected using Haar wavelets with the AdaBoost cascade algorithm; eye closing is then measured by analyzing the optical flow in that region [13,22]. |
| Eye closure, blink rates | Example 1: The PERCLOS video-based system counts eyelid closures within 1 to 3 min intervals, and an algorithm uses this count to estimate drowsiness; a drowsy person generally keeps the eyes closed longer than an alert one. The system has been validated in real on-road driving and against the Psychomotor Vigilance Test (PVT) [23,24,25]. Example 2: Kinect cameras (a passive stereo pair) capture video of the head to generate the 3D pose in real time; the face pose (±1 mm, ±1°), gaze direction (±3°), blink rate, and eye closure are measured [23,26]. |
| Eye blink detection approach | A multi-sensor system integrates an eyelid camera (measuring eye blinks) with other monitoring sensors (a steering-grip sensor, a lane tracker); all signals are fused to perform the task [27,28,29,30]. |
| Multiple measures approach | An example of alertness-monitoring technology combines the MINDS system and an eye-gaze system to estimate head position and gaze with two potential fitness-for-duty systems (Safety Scope and the Mayo pupillometry system). Such a system measures physiological, behavioral, and subjective sleepiness parameters, which are integrated using a neural-fuzzy hybrid scheme [31,32]. |
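The PERCLOS measure described above reduces to a simple computation: the fraction of frames within a time window in which eyelid closure exceeds a threshold. The sketch below is illustrative only (the function name, the 80% threshold, and the window length are assumptions consistent with the description, not the validated system's parameters):

```python
import numpy as np

def perclos(closure, fps, window_s=60.0, threshold=0.8):
    """PERCLOS over the most recent window: the fraction of frames in
    which eyelid closure is at least `threshold` (1.0 = fully closed)."""
    n = int(window_s * fps)                          # frames in the window
    recent = np.asarray(closure[-n:], dtype=float)   # most recent n frames
    return float(np.mean(recent >= threshold))
```

A higher PERCLOS value over a 1 to 3 min window would then map to a higher estimated drowsiness level, as in the system described above.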
| Database Name | Total No. of Images | Left_90 | Left_45 | Front_0 | Right_45 | Right_90 |
| --- | --- | --- | --- | --- | --- | --- |
| Different Gaze Directions | | |
| --- | --- | --- |
| Lower left | Lower middle | Lower right |
| Middle left | Middle (Front) | Middle right |
| Upper left | Upper middle | Upper right |
Eye states: Eyes Not Open, Half Eyes Open, One Eye Open.
| Elapsed Time (ms) | PCA | LDA |
| --- | --- | --- |
| Half Eyes Open | 7.43 | 7.37 |
| One Eye Open | 7.66 | 7.48 |

| | PCA | LDA |
| --- | --- | --- |
| Proposed Head Pose Estimation | 79% | 87% |
| Proposed Gaze Direction Estimation | 77% | 85% |
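The comparison above contrasts PCA and LDA as the dimensionality-reduction step of Section 3.3. A minimal NumPy sketch of the PCA projection applied to feature vectors (illustrative only; the function and data are hypothetical, not the paper's pipeline) could look like:

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X (one feature vector per sample) onto the
    top-k principal components of the sample covariance."""
    Xc = X - X.mean(axis=0)             # center each feature
    cov = np.cov(Xc, rowvar=False)      # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]    # indices of the k largest
    return Xc @ vecs[:, top]
```

LDA differs in that it uses class labels to maximize between-class separation rather than total variance, which is consistent with the higher accuracies reported for LDA in the table above.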
Vankayalapati, H.D.; Kuchibhotla, S.; Chadalavada, M.S.K.; Dargar, S.K.; Anne, K.R.; Kyamakya, K. A Novel Zernike Moment-Based Real-Time Head Pose and Gaze Estimation Framework for Accuracy-Sensitive Applications. Sensors 2022, 22, 8449. https://doi.org/10.3390/s22218449