Master Thesis MSTR-2021-111

Bibliography: Stumber, Jonathan P.: CNN-based 6D pose estimation of vehicles for automated driving based on mono camera images.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 111 (2021).
67 pages, English.

Estimating 3D properties of objects, such as their 3D position, orientation and extent, from a single mono camera image is a challenging and ill-posed task, but it is very important for autonomous driving technology. In this thesis, ‘Pix2Pose’, a recent 6D pose estimation method based on an auto-encoder architecture and trained as a GAN, was tested on the KITTI dataset. The CNN was trained to estimate, for each pixel, the 3D object-local coordinates of all 3D bounding boxes corresponding to a vehicle in an image. From those 2D-3D point correspondences the pose was estimated by a PnP solver with RANSAC. This pose estimate is combined with an additional estimate of the extent to form a 3D bounding box prediction. The original Pix2Pose network with predictions on cropped image parts outperformed a variant of a larger Pix2Pose network with predictions on the full image. However, the accuracy of the pose estimation is limited due to mismatches for occluded 3D bounding boxes, even if rendered 3D object coordinates are used.
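
The final step of the abstract, combining a pose with an extent to form a 3D bounding box, can be illustrated with a minimal sketch. It is not taken from the thesis. It assumes that object coordinates are normalized to [0, 1] per axis with the origin at the box centre, that the pose is given as a rotation matrix R and translation t in camera coordinates (as a PnP solver with RANSAC, for instance OpenCV's `cv2.solvePnPRansac`, would return after converting its rotation vector), and that the camera is a simple pinhole. All function names, the axis order of the extent, and the intrinsics are hypothetical.

```python
import math

def rot_y(yaw):
    """Rotation matrix for `yaw` radians about the camera Y axis
    (assumed here only to build an example pose)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def transform(R, t, p):
    """Apply the rigid transform x_cam = R * p + t to one 3D point."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def denormalize(coord, extent):
    """Map a normalized object coordinate in [0, 1]^3 to metres in the
    object frame, origin at the box centre (assumed encoding)."""
    return tuple((c - 0.5) * e for c, e in zip(coord, extent))

def box_corners(R, t, extent):
    """Return the 8 corners of the 3D bounding box in camera coordinates."""
    corners = []
    for sx in (0.0, 1.0):
        for sy in (0.0, 1.0):
            for sz in (0.0, 1.0):
                local = denormalize((sx, sy, sz), extent)
                corners.append(transform(R, t, local))
    return corners

def project(K, p):
    """Pinhole projection of a camera-frame point to pixel coordinates.
    K is (fx, fy, cx, cy); the values used below are hypothetical."""
    fx, fy, cx, cy = K
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)

if __name__ == "__main__":
    K = (700.0, 700.0, 600.0, 180.0)   # hypothetical intrinsics
    extent = (4.0, 1.5, 1.8)           # hypothetical box size along x, y, z in metres
    R, t = rot_y(math.radians(30.0)), (2.0, 1.0, 20.0)
    for corner in box_corners(R, t, extent):
        print(project(K, corner))
```

The same `denormalize` mapping, applied to each predicted pixel, would yield the 3D side of the 2D-3D correspondences that the PnP solver consumes; the exact encoding used in the thesis may differ.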

Department(s): University of Stuttgart, Institute of Visualisation and Interactive Systems, Visualisation and Interactive Systems
Supervisor(s): Bruhn, Prof. Andres; Ertl, Prof. Thomas; Hermann, Dr.-Ing. Christian; Cagman, Can
Entry date: April 18, 2023