Masterarbeit MSTR-2024-97

Bibliographic Data
Holtz, David: Neural Rendering for Sensor Adaptation in 3D Object Detection.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Master's Thesis No. 97 (2024).
101 pages, in English.
Abstract

Autonomous vehicles are equipped with multiple cameras to perceive their surroundings and make informed driving decisions. A key perception task is 3D object detection, which localizes objects in 3D space from sensor data such as camera images. However, different vehicle models often have varying camera setups due to restricted sensor placements or hardware configurations, leading to a so-called cross-sensor domain gap. As autonomous driving technology scales across fleets, understanding the impact of this gap on object detection performance becomes increasingly important; yet, the issue has been scarcely investigated. Neural Radiance Fields (NeRFs) have shown great potential for generating novel views of 3D scenes. In this thesis, we propose a novel sensor adaptation method that leverages NeRFs to transform video recordings of street view scenes from one camera setup to another. Using these transformed scenes, we can train 3D object detectors specifically tailored to the deployed camera setups. To this end, we extend a recent NeRF model to enhance the rendering of dynamic actors while generalizing well to novel views and remaining highly scalable. To evaluate our approach, we generate a synthetic dataset comprising 850 street view scenes recorded from two different camera setups. Based on this dataset, we systematically investigate the impact of the cross-sensor domain gap on the performance of three state-of-the-art 3D object detectors, as well as the effectiveness of our approach in mitigating this gap. Our experiments show that the performance of 3D object detectors degrades significantly in the presence of cross-sensor domain gaps. Furthermore, we demonstrate that our sensor adaptation method effectively mitigates this degradation for transformer-based 3D object detectors, achieving performance close to that of detectors trained on actual target-sensor data.
These results suggest that our method can turn the sensor adaptation problem into a purely computational task: the approach scales well across a fleet with diverse sensor setups, as it reduces the need for costly data collection on each platform.
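The novel-view rendering that underpins this kind of NeRF-based sensor transformation is typically done by volumetric compositing along camera rays, as introduced in the original NeRF formulation. Below is a minimal, hypothetical sketch of that compositing step (not the thesis implementation); the function name and array shapes are illustrative assumptions.

```python
import numpy as np

def render_ray(sigmas, colors, deltas):
    """Composite a colour along one camera ray using the standard NeRF
    quadrature. This is a generic illustration, not the thesis code.

    sigmas : (N,)   volume densities at the N samples along the ray
    colors : (N, 3) RGB colour predicted at each sample
    deltas : (N,)   distances between consecutive samples
    """
    # Per-sample opacity: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance T_i: fraction of light surviving up to sample i
    trans = np.cumprod(1.0 - alphas + 1e-10)
    trans = np.concatenate([[1.0], trans[:-1]])
    # Weight of each sample's colour contribution
    weights = trans * alphas
    # Expected colour of the ray
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights
```

Rendering the same scene representation along rays cast from a *different* camera setup (new extrinsics and intrinsics) is what yields the transformed training images.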

Department(s): Universität Stuttgart, Institut für Visualisierung und Interaktive Systeme, Visualisierung und Interaktive Systeme
Supervisors: Bruhn, Prof. Andrés; Uhrig, Jonas
Submission date: March 14, 2025
Publ. Informatik