Robust Pose Estimation of Pedestrians with a Deep Neural Networks

Chuho Yi; Jungwon Cho

doi:10.18517/ijaseit.13.4.19022

Abstract

This paper presents a method for robustly estimating pedestrian pose, intended for autonomous vehicles approaching pedestrians who are still far away. A distant pedestrian occupies only a small region of the camera image and, when observed from a moving car, appears as a low-resolution image of a non-rigid body. Detection and orientation estimation are therefore difficult with conventional image processing methods. To address these limitations, we fuse several deep neural networks (DNNs) in multiple stages, solving a problem that is difficult for a single DNN. First, DNNs detect pedestrians and enlarge the observed image, which improves data availability. Next, a DNN-based pose estimator is applied to estimate challenging postures. The method needs only a single camera. In experiments on data that is challenging for conventional pose estimation methods, the proposed method proved superior. Applications include observing small objects and objects in the far distance, and the method may be especially useful in surveillance systems, sports broadcasting, and other applications requiring human posture estimation.
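
The staged structure described in the abstract (detect, enlarge, then estimate pose) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Box`, `crop`, `enlarge_nearest`, `to_frame_coords`, `estimate_far_pedestrian_poses`) is hypothetical, the detector and pose estimator are passed in as plain callables because the abstract does not name the networks used, and nearest-neighbour upscaling stands in for the enlargement network. What the sketch does show is the bookkeeping such a pipeline needs: keypoints found in the enlarged crop must be mapped back to coordinates in the original frame.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]      # grayscale image as rows of pixel values
Point = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class Box:
    """Axis-aligned pedestrian bounding box in original-frame pixels."""
    x: int
    y: int
    w: int
    h: int


def crop(image: Image, box: Box) -> Image:
    """Cut the detected pedestrian region out of the full frame."""
    return [row[box.x:box.x + box.w] for row in image[box.y:box.y + box.h]]


def enlarge_nearest(patch: Image, scale: int) -> Image:
    """Assumed stand-in for the enlargement DNN: nearest-neighbour upscaling."""
    return [
        [px for px in row for _ in range(scale)]  # repeat each pixel horizontally
        for row in patch
        for _ in range(scale)                     # repeat each row vertically
    ]


def to_frame_coords(points: List[Point], box: Box, scale: int) -> List[Point]:
    """Map keypoints from the enlarged patch back to the original frame.

    Ignores sub-pixel centre offsets, which is enough for a sketch.
    """
    return [(box.x + px / scale, box.y + py / scale) for px, py in points]


def estimate_far_pedestrian_poses(
    image: Image,
    detect: Callable[[Image], List[Box]],
    estimate_pose: Callable[[Image], List[Point]],
    scale: int = 4,
) -> List[List[Point]]:
    """Run detect -> enlarge -> pose estimation for every detection."""
    poses = []
    for box in detect(image):
        patch = enlarge_nearest(crop(image, box), scale)
        poses.append(to_frame_coords(estimate_pose(patch), box, scale))
    return poses


# Usage with dummy stages: one fixed detection, and a "pose" consisting of
# a single keypoint at the centre of the enlarged patch.
frame = [[0] * 8 for _ in range(6)]
poses = estimate_far_pedestrian_poses(
    frame,
    detect=lambda img: [Box(x=2, y=1, w=2, h=3)],
    estimate_pose=lambda patch: [(len(patch[0]) / 2, len(patch) / 2)],
)
```

Passing the stages in as callables keeps the sketch independent of any particular detector, super-resolution model, or pose network; in a real system each callable would wrap a trained model.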

Keywords

Pose estimation; pedestrian; deep neural network (DNN); super resolution; mono camera-based estimation

Published by INSIGHT - Indonesian Society for Knowledge and Human Development

International Journal on Advanced Science, Engineering and Information Technology
