Human Pose Estimation from Monocular Images

Type: Fortschritt-Berichte VDI
Publication date: 29 September 2020
Series: 10
Volume number: 869
Author: Bastian Wandt, M.Sc.
Place: Hannover
ISBN: 978-3-18-386910-7
ISSN: 0178-9627
Year of publication: 2020
Number of pages: 130
Number of figures: 47
Number of tables: 8
Product type: Book (paperback, DIN A5)

Product description

Abstract

This dissertation addresses the problem of capturing human motions and poses using a single camera. The first part of the thesis proposes two closely related approaches for the 3D reconstruction of human motions from image sequences. To resolve the ambiguities inherent in monocular 3D reconstruction, the main idea of this part is to exploit temporal properties of human motions in combination with a human body model learned from training data. The second part of the thesis tackles the problem of reconstructing a human pose from a single image. Here, a human body model is learned by training a deep neural network that captures nonlinearities and anthropometric constraints.

Contents
1 Introduction
1.1 Applications and Commercial Systems
1.2 Image-based Motion Capture
1.3 Contributions
1.3.1 Time Consistent Human Motion Reconstruction
1.3.2 RepNet
1.4 Structure of the Thesis
1.5 List of Publications
1.5.1 Human Motion Capture
1.5.2 Other Publications
2 Related work
2.1 Non-rigid Structure-from-Motion
2.2 Single Image Approaches
2.2.1 Reprojection Error Optimization
2.2.2 Direct Inference using Neural Networks
2.3 Time Consistent Human Motion Capture
3 Fundamentals
3.1 Camera Models
3.1.1 Projective Transformations
3.1.2 Intrinsic Parameters
3.1.3 Extrinsic Parameters
3.1.4 Simplified Camera Models
3.2 Human Pose Representations
3.2.1 Coordinate-based Representations
3.2.2 Surface Mesh-based Representations
3.2.3 Subspaces of Human Poses
3.3 Non-Rigid Structure from Motion
3.4 Error Metrics
3.5 Datasets
4 Exploiting temporal properties
4.1 Periodic and Non-periodic Constraints
4.1.1 Factorization model
4.1.2 Camera Parameter Estimation
4.1.3 Periodic Motion
4.1.4 Non-Periodic Motion
4.1.5 Algorithm
4.1.6 Experimental Results
4.1.7 Conclusion
4.2 A Novel Kinematic Chain Space
4.2.1 Estimating Camera and Shape
4.2.2 Kinematic Chain Space
4.2.3 Trace Norm Constraint
4.2.4 Camera
4.2.5 Algorithm
4.2.6 Experiments
4.2.7 Conclusion
5 Single image reconstruction using adversarial training
5.1 Method
5.2 Pose and Camera Estimation
5.3 Reprojection Layer
5.4 Critic Network
5.5 Camera
5.6 Data Preprocessing
5.7 Training
5.8 Results
5.8.1 Quantitative Evaluation on Human3.6M
5.8.2 Quantitative Evaluation on MPI-INF-3DHP
5.8.3 Plausibility of the Reconstructions
5.8.4 Noisy observations
5.8.5 Qualitative Evaluation
5.8.6 Conclusion
6 Conclusions
Bibliography

Keywords: Human Pose Estimation, 3D Reconstruction, Monocular Cameras, Structure From Motion, Universität Hannover, TNT

52.00 € incl. VAT
VDI member price:*
46.80 € incl. VAT

* The VDI member discount applies only to private individuals.