Gait Recognition by Combining the Long-Short-Term Attention Network and Personal Physiological Features
Abstract
1. Introduction
2. Methods
2.1. System Overview
2.2. Gait-Silhouette-Based Attention Modules
2.2.1. Local Short-Term Attention (LSTA)
2.2.2. Global Long-Term Attention (GLTA)
2.2.3. Adaptive Temporal Feature Aggregation (ATFA)
2.3. Personal Physiological Feature Module
2.3.1. Estimating Human Physiological Information (HPI)
2.3.2. Physiological Feature Extraction (PFE) Module
2.4. Loss Function
3. Results
3.1. Datasets and Training Details
3.2. Efficiency Evaluation of Physiological Information Computing
3.3. Comparison with State-of-the-Art Methods
3.3.1. Comparative Experiments on CASIA-B Dataset
3.3.2. Comparative Experiment on the Multi-State Gait Dataset
3.4. Ablation Study
3.5. Transplantation Study
4. Discussion
- (1) Experiments on other large public datasets (such as the OU-MVLP dataset) should be performed. Due to hardware limitations, the authors could not test the proposed method on such large datasets; it is believed that these experiments could be carried out with more powerful GPU hardware.
- (2) Determining how to extract more accurate HPI features should be investigated. Since only monocular images were used, the skeleton points of a person may become invisible as the view angle varies. Three-dimensional skeleton points are considered a possible solution to this problem; they could be obtained from an RGB-D camera, from stereo vision, or from 2D-to-3D neural networks applied to successive frames.
- (3) Clearer test images should be used in future research. Since motion blur caused many of the experimental errors in this work (the estimation of skeleton points becomes unstable), the proposed method should be applied to test images captured by high-speed cameras rather than ordinary ones (e.g., those recording at only 20 fps).
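As context for item (2): once a pose estimator provides 2D skeleton points, the angle-type HPI features (elbow, knee, and hunchback angles) reduce to the interior angle at a joint. A minimal sketch, assuming hypothetical pixel coordinates and NumPy; the exact keypoint definitions used in the paper may differ:

```python
import numpy as np

def angle_at(a, b, c):
    """Interior angle (degrees) at joint b, formed by segments b->a and b->c.

    Works for 2D image coordinates; with 3D skeleton points (as suggested
    in the discussion) the same formula applies and becomes view-invariant.
    """
    v1 = np.asarray(a, float) - np.asarray(b, float)
    v2 = np.asarray(c, float) - np.asarray(b, float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip guards against floating-point values slightly outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical 2D keypoints (pixel coordinates) from a pose estimator
shoulder, elbow_pt, wrist = (100.0, 50.0), (110.0, 90.0), (140.0, 110.0)
elbow_angle = angle_at(shoulder, elbow_pt, wrist)
```

The knee angle (hip–knee–ankle) and hunchback angle (e.g., neck–mid-spine–hip) would use the same helper with different keypoint triples.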
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Liu, J.; Zheng, N. Gait history image: A novel temporal template for gait recognition. In Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, Beijing, China, 2–5 July 2007; IEEE: Beijing, China, 2007; pp. 663–666.
- Singh, S.; Biswas, K. Biometric gait recognition with carrying and clothing variants. In Proceedings of the International Conference on Pattern Recognition and Machine Intelligence, Delhi, India, 16–20 December 2009; Springer: Berlin/Heidelberg, Germany, 2009; pp. 446–451.
- Huang, S.; Elgammal, A.; Lu, J.; Yang, D. Cross-speed gait recognition using speed-invariant gait templates and globality–locality preserving projections. IEEE Trans. Inf. Forensics Secur. 2015, 10, 2071–2083.
- Shiraga, K.; Makihara, Y.; Muramatsu, D.; Echigo, T.; Yagi, Y. Geinet: View-invariant gait recognition using a convolutional neural network. In Proceedings of the 2016 International Conference on Biometrics (ICB), Halmstad, Sweden, 13–16 June 2016; IEEE: Halmstad, Sweden, 2016; pp. 1–8.
- Chao, H.; He, Y.; Zhang, J.; Feng, J. Gaitset: Regarding gait as a set for cross-view gait recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 8126–8133.
- Fan, C.; Peng, Y.; Cao, C.; Liu, X.; Hou, S.; Chi, J.; Huang, Y.; Li, Q.; He, Z. Gaitpart: Temporal part-based model for gait recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 14225–14233.
- Liao, R.; Cao, C.; Garcia, E.B.; Yu, S.; Huang, Y. Pose-based temporal-spatial network (PTSN) for gait recognition with carrying and clothing variations. In Proceedings of the Chinese Conference on Biometric Recognition, Shenzhen, China, 28–29 October 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 474–483.
- Wu, Z.; Huang, Y.; Wang, L.; Wang, X.; Tan, T. A comprehensive study on cross-view gait based human identification with deep cnns. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 209–226.
- Wolf, T.; Babaee, M.; Rigoll, G. Multi-view gait recognition using 3D convolutional neural networks. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; IEEE: Phoenix, AZ, USA, 2016; pp. 4165–4169.
- Thapar, D.; Nigam, A.; Aggarwal, D.; Agarwal, P. VGR-net: A view invariant gait recognition network. In Proceedings of the 2018 IEEE 4th International Conference on Identity, Security, and Behavior Analysis (ISBA), Singapore, 11–12 January 2018; IEEE: Singapore, 2018; pp. 1–8.
- Liao, R.; Yu, S.; An, W.; Huang, Y. A model-based gait recognition method with body pose and human prior knowledge. Pattern Recognit. 2020, 98, 107069.
- Feng, Y.; Li, Y.; Luo, J. Learning effective gait features using LSTM. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December 2016; IEEE: Cancun, Mexico, 2016; pp. 325–330.
- Yu, S.; Chen, H.; Garcia Reyes, E.B.; Poh, N. Gaitgan: Invariant gait feature extraction using generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 30–37.
- Han, J.; Bhanu, B. Individual recognition using gait energy image. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 28, 316–322.
- Tran, D.; Bourdev, L.; Fergus, R.; Torresani, L.; Paluri, M. Learning spatiotemporal features with 3d convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 4489–4497.
- Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- BenAbdelkader, C.; Cutler, R.; Davis, L. View-invariant estimation of height and stride for gait recognition. In Proceedings of the International Workshop on Biometric Authentication, Copenhagen, Denmark, 1 June 2002; Springer: Berlin/Heidelberg, Germany, 2002; pp. 155–167.
- Moustakas, K.; Tzovaras, D.; Stavropoulos, G. Gait recognition using geometric features and soft biometrics. IEEE Signal Process. Lett. 2010, 17, 367–370.
- Yu, S.; Tan, D.; Tan, T. A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR’06), Hong Kong, China, 20–24 August 2006; IEEE: Hong Kong, China, 2006; Volume 4, pp. 441–444.
- Jocher, G. Ultralytics YOLOv5. Available online: https://github.com/ultralytics/yolov5 (accessed on 1 May 2021).
- Hoiem, D.; Efros, A.A.; Hebert, M. Putting objects in perspective. Int. J. Comput. Vis. 2008, 80, 3–15.
- Cao, Z.; Simon, T.; Wei, S.E.; Sheikh, Y. Realtime multi-person 2d pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7291–7299.
- Hermans, A.; Beyer, L.; Leibe, B. In defense of the triplet loss for person re-identification. arXiv 2017, arXiv:1703.07737.
- Zhang, Z.; Tran, L.; Yin, X.; Atoum, Y.; Liu, X.; Wan, J.; Wang, N. Gait recognition via disentangled representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 4710–4719.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Takemura, N.; Makihara, Y.; Muramatsu, D.; Echigo, T.; Yagi, Y. Multi-view large population gait dataset and its performance evaluation for cross-view gait recognition. IPSJ Trans. Comput. Vis. Appl. 2018, 10, 1–14.
Estimation error of the angle-type HPI features (elbow, knee, and hunchback angles) under the 30, 45, and 60 conditions:

| Person | Elbow 30 | Elbow 45 | Elbow 60 | Knee 30 | Knee 45 | Knee 60 | Hunchback 30 | Hunchback 45 | Hunchback 60 |
|---|---|---|---|---|---|---|---|---|---|
| ID1 | 6.6% | 4.6% | 4.7% | 5.8% | 5.4% | 6.1% | 10.5% | 9.7% | 11.2% |
| ID2 | 8.6% | 4.8% | 10.7% | 8.4% | 6.5% | 9.3% | 7.7% | 6.7% | 9.6% |
| ID3 | 5.6% | 5.9% | 5.9% | 6.3% | 3.8% | 3.3% | 12.7% | 8.2% | 8.7% |
| ID4 | 4.4% | 6.7% | 6.4% | 4.5% | 6.5% | 5.5% | 8.6% | 9.8% | 8.9% |

Mean estimation error: elbow angle 6.2%, knee angle 6.0%, hunchback angle 9.4%.
Estimation error of the length- and frequency-type HPI features under the 30, 45, and 60 conditions:

| Person | Height 30 | Height 45 | Height 60 | Shoulder Width 30 | Shoulder Width 45 | Shoulder Width 60 | Step Length 30 | Step Length 45 | Step Length 60 | Step Frequency 30 | Step Frequency 45 | Step Frequency 60 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ID1 | 1.1% | 0.9% | 1.0% | 5.1% | 2.5% | 7.6% | 9.4% | 8.8% | 8.2% | 6.5% | 2.3% | 5.8% |
| ID2 | 2.1% | 1.1% | 0.5% | 1.1% | 12.1% | 7.4% | 7.5% | 8.2% | 7.7% | 4.3% | 6.6% | 6.6% |
| ID3 | 0.9% | 2.8% | 2.2% | 0.8% | 2.8% | 5.7% | 10.2% | 5.5% | 7.5% | 6.3% | 6.7% | 5.9% |
| ID4 | 0.7% | 0.9% | 0.3% | 2.3% | 3.3% | 5.5% | 7.7% | 7.9% | 8.4% | 2.2% | 5.2% | 4.3% |

Mean estimation error: height 1.2%, shoulder width 4.7%, step length 8.1%, step frequency 5.2%.
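Among the HPI features above, step frequency is the only temporal one; a common way to estimate it is to track a periodic per-frame silhouette signal (e.g., the width of the leg region) and take the dominant frequency of its spectrum. A minimal sketch on a synthetic signal; the frame rate, the signal model, and the 2 steps/s ground truth are assumptions for illustration, not values from the paper:

```python
import numpy as np

fps = 25.0                      # assumed video frame rate
n_frames = 100                  # a 4 s clip
t = np.arange(n_frames) / fps
true_step_freq = 2.0            # synthetic ground truth, steps per second

# Toy stand-in for a per-frame silhouette measurement (e.g., leg-region width)
width = 1.0 + 0.3 * np.cos(2 * np.pi * true_step_freq * t)

# The dominant frequency of the mean-removed signal estimates step frequency
spectrum = np.abs(np.fft.rfft(width - width.mean()))
freqs = np.fft.rfftfreq(n_frames, d=1.0 / fps)
est_step_freq = float(freqs[spectrum.argmax()])
```

With a 4 s clip the frequency resolution is fps / n_frames = 0.25 Hz, which also illustrates why motion blur and short clips (discussion item (3)) make this estimate unstable.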
Rank-1 accuracy (%) on the CASIA-B dataset; gallery NM #1–4, probe views 0°–180°:

| Probe | Method | 0° | 18° | 36° | 54° | 72° | 90° | 108° | 126° | 144° | 162° | 180° | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NM #5–6 | CNN-LB [8] | 82.6 | 90.3 | 96.1 | 94.3 | 90.1 | 87.4 | 89.9 | 94.0 | 94.7 | 91.3 | 78.5 | 89.9 |
| | GaitSet [5] | 90.8 | 97.9 | 99.4 | 96.9 | 93.6 | 91.7 | 95.0 | 97.8 | 98.9 | 96.8 | 85.8 | 95.0 |
| | GaitNet [25] | 91.2 | 92.0 | 90.5 | 95.6 | 86.9 | 92.6 | 93.5 | 96.0 | 90.9 | 88.8 | 89.0 | 91.6 |
| | GaitPart [6] | 94.1 | 98.6 | 99.3 | 98.5 | 94.0 | 92.3 | 95.9 | 98.4 | 99.2 | 97.8 | 90.4 | 96.2 |
| | Ours | 93.4 | 98.4 | 99.3 | 98.4 | 95.1 | 93.2 | 96.4 | 98.3 | 99.4 | 97.9 | 92.2 | 96.5 |
| BG #1–2 | CNN-LB [8] | 64.2 | 80.6 | 82.7 | 76.9 | 64.8 | 63.1 | 68.0 | 76.9 | 82.2 | 75.4 | 61.3 | 72.4 |
| | GaitSet [5] | 83.8 | 91.2 | 91.8 | 88.8 | 83.3 | 81.0 | 84.1 | 90.0 | 92.2 | 94.4 | 79.0 | 87.2 |
| | GaitNet [25] | 83.0 | 87.8 | 88.3 | 93.3 | 82.6 | 74.8 | 89.5 | 91.0 | 86.1 | 81.2 | 85.6 | 85.7 |
| | GaitPart [6] | 89.1 | 94.8 | 96.7 | 95.1 | 88.3 | 84.9 | 89.0 | 93.5 | 96.1 | 93.8 | 85.8 | 91.5 |
| | Ours | 90.1 | 96.1 | 97.0 | 95.0 | 90.6 | 85.4 | 90.7 | 94.8 | 97.5 | 94.3 | 87.1 | 92.6 |
| CL #1–2 | CNN-LB [8] | 37.7 | 57.2 | 66.6 | 61.1 | 55.2 | 54.6 | 55.2 | 59.1 | 58.9 | 48.8 | 39.4 | 54.0 |
| | GaitSet [5] | 61.4 | 75.4 | 80.7 | 77.3 | 72.1 | 70.1 | 71.5 | 73.5 | 73.5 | 68.4 | 50.0 | 70.4 |
| | GaitNet [25] | 42.1 | 58.2 | 65.1 | 70.7 | 68.0 | 70.6 | 65.3 | 69.4 | 51.5 | 50.1 | 36.6 | 58.9 |
| | GaitPart [6] | 70.7 | 85.5 | 86.9 | 83.3 | 77.1 | 72.5 | 76.9 | 82.2 | 83.8 | 80.2 | 66.5 | 78.7 |
| | Ours | 71.2 | 84.4 | 86.7 | 83.3 | 79.6 | 76.6 | 79.3 | 83.8 | 85.0 | 80.9 | 67.2 | 79.8 |
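The Rank-1 scores reported above are obtained by nearest-neighbor matching of each probe embedding against the gallery embeddings. A minimal sketch of that metric; the toy 2D embeddings and Euclidean distance are illustrative, while the actual features come from the network described in Section 2:

```python
import numpy as np

def rank1_accuracy(gallery_feats, gallery_ids, probe_feats, probe_ids):
    """Rank-1: fraction of probes whose nearest gallery embedding
    (Euclidean distance) belongs to the probe's true identity."""
    # Pairwise distances: shape (n_probes, n_gallery)
    d = np.linalg.norm(probe_feats[:, None, :] - gallery_feats[None, :, :], axis=-1)
    nearest_ids = gallery_ids[d.argmin(axis=1)]
    return float((nearest_ids == probe_ids).mean())

# Toy example: 3 gallery identities, 4 probes, 2D embeddings
gallery = np.array([[0.0, 0.0], [5.0, 5.0], [9.0, 0.0]])
gids = np.array([1, 2, 3])
probes = np.array([[0.2, 0.1], [5.1, 4.8], [0.3, 0.2], [8.0, 1.0]])
pids = np.array([1, 2, 2, 3])
acc = rank1_accuracy(gallery, gids, probes, pids)  # 3 of 4 probes matched
```

In the CASIA-B protocol this is computed per gallery/probe view pair, which yields the per-angle columns of the tables.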
Rank-1 accuracy (%) on the multi-state gait dataset; gallery NM #1–4, probe views 0°–270°:

| Probe | Method | 0° | 15° | 30° | 45° | 60° | 75° | 90° | 180° | 195° | 210° | 225° | 240° | 255° | 270° | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NM #5–6 | GaitSet [5] | 79.7 | 87.4 | 91.8 | 96.2 | 91.8 | 76.9 | 77.5 | 79.7 | 87.9 | 94.0 | 92.9 | 91.2 | 81.3 | 76.9 | 86.1 |
| | GaitPart [6] | 76.4 | 89.0 | 96.7 | 96.7 | 95.1 | 87.9 | 86.8 | 76.9 | 91.8 | 97.8 | 99.5 | 99.5 | 90.7 | 87.9 | 90.9 |
| | Ours_1 | 81.9 | 91.2 | 96.7 | 96.2 | 95.6 | 87.9 | 87.4 | 75.8 | 92.9 | 95.1 | 97.8 | 97.3 | 90.7 | 84.6 | 90.7 |
| | Ours_2 | 81.9 | 91.2 | 96.7 | 96.7 | 95.6 | 88.5 | 87.9 | 75.8 | 92.9 | 95.1 | 97.8 | 97.3 | 90.7 | 85.7 | 91.0 |
| BG #1–4 | GaitSet [5] | 73.9 | 81.2 | 89.3 | 91.5 | 87.1 | 71.4 | 75.8 | 78.3 | 83.2 | 90.1 | 90.7 | 85.7 | 75.8 | 80.0 | 82.4 |
| | GaitPart [6] | 69.5 | 84.3 | 91.8 | 94.8 | 86.8 | 78.0 | 78.9 | 66.2 | 81.2 | 92.6 | 93.1 | 89.6 | 86.8 | 79.4 | 83.8 |
| | Ours_1 | 70.6 | 85.5 | 92.6 | 93.1 | 91.2 | 81.0 | 81.6 | 72.8 | 88.0 | 92.3 | 92.3 | 87.1 | 83.5 | 81.9 | 85.3 |
| | Ours_2 | 70.6 | 86.3 | 93.7 | 93.7 | 91.2 | 82.1 | 81.6 | 73.0 | 88.9 | 93.4 | 92.6 | 88.3 | 84.9 | 82.1 | 85.9 |
| CL #1–4 | GaitSet [5] | 67.6 | 76.9 | 76.4 | 75.8 | 72.5 | 67.6 | 60.4 | 62.1 | 70.9 | 74.2 | 74.2 | 74.7 | 55.0 | 61.0 | 69.2 |
| | GaitPart [6] | 73.0 | 72.4 | 82.3 | 84.0 | 80.1 | 71.9 | 74.1 | 67.5 | 70.2 | 81.8 | 75.7 | 64.2 | 58.7 | 64.7 | 72.9 |
| | Ours_1 | 73.6 | 78.0 | 87.9 | 85.2 | 75.8 | 70.9 | 68.7 | 63.7 | 76.9 | 83.0 | 80.2 | 74.7 | 61.5 | 63.2 | 74.5 |
| | Ours_2 | 73.7 | 78.6 | 87.9 | 86.3 | 76.9 | 72.5 | 69.2 | 63.8 | 78.0 | 84.1 | 80.8 | 75.3 | 63.2 | 63.7 | 75.3 |
Rank-1 accuracy (%):

| Model | NM | BG | CL | Mean |
|---|---|---|---|---|
| GaitSet [5] | 95.0 | 87.2 | 70.4 | 84.2 |
| GaitPart [6] | 96.2 | 91.5 | 78.7 | 88.8 |
| Ours: Baseline (LSTA) | 96.4 | 91.4 | 77.4 | 88.4 |
| Ours: Baseline (LSTA) + GLTA | 96.7 | 91.8 | 79.2 | 89.2 |
| Ours: Baseline (LSTA) + GLTA + ATFA | 96.5 | 92.6 | 79.8 | 89.6 |
Rank-1 accuracy (%):

| Model | NM | BG | CL | Mean |
|---|---|---|---|---|
| GaitSet [5] | 86.07 | 82.43 | 69.23 | 79.24 |
| GaitPart [6] | 90.89 | 83.78 | 72.90 | 82.52 |
| Ours: Baseline | 90.74 | 85.25 | 74.53 | 83.51 |
| Ours: Baseline + HPI | 90.81 | 85.34 | 74.64 | 83.60 |
| Ours: Baseline + HPI + PFE | 90.97 | 85.89 | 75.28 | 84.05 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hua, C.; Pan, Y.; Li, J.; Wang, Z. Gait Recognition by Combining the Long-Short-Term Attention Network and Personal Physiological Features. Sensors 2022, 22, 8779. https://doi.org/10.3390/s22228779