Master Thesis MSTR-2024-97

Bibliography

Holtz, David: Neural Rendering for Sensor Adaptation in 3D Object Detection.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 97 (2024).
101 pages, English.
Abstract

Autonomous vehicles are equipped with multiple cameras to perceive their surroundings and make informed driving decisions. A key perception task is 3D object detection, which involves localizing objects in 3D space using sensor data such as camera images. However, different vehicle models often have varying camera setups due to restricted sensor placements or hardware configurations, leading to a so-called cross-sensor domain gap. As autonomous driving technology scales across fleets, understanding the impact of this gap on object detection performance becomes increasingly important. Yet this issue has scarcely been investigated. Neural Radiance Fields (NeRFs) have shown great potential in generating novel views of 3D scenes. In this thesis, we propose a novel sensor adaptation method that leverages NeRFs for transforming video recordings of street view scenes from one camera setup to another. By using these transformed scenes, we can train 3D object detectors specifically tailored to the deployed camera setups. To this end, we extend a recent NeRF model to enhance the rendering of dynamic actors while generalizing well to novel views and being highly scalable. To evaluate our approach, we generate a synthetic dataset comprising 850 street view scenes recorded from two different camera setups. Based on this dataset, we systematically investigate the impact of the cross-sensor domain gap on the performance of three state-of-the-art 3D object detectors, as well as the effectiveness of our approach in mitigating this gap. Our experiments show that the performance of 3D object detectors degrades significantly in the presence of cross-sensor domain gaps. Furthermore, we demonstrate that our sensor adaptation method can effectively mitigate this degradation for transformer-based 3D object detectors, achieving performance close to that of detectors trained on actual target sensor data.
These results suggest that our method has the potential to turn the sensor adaptation problem into a computational task. This approach scales well across a fleet with diverse sensor setups, as it reduces the need for costly data collection for each platform.

Department(s): University of Stuttgart, Institute of Visualisation and Interactive Systems, Visualisation and Interactive Systems
Supervisor(s): Bruhn, Prof. Andrés; Uhrig, Jonas
Entry date: March 14, 2025