Master Thesis MSTR-2023-112

Bibliography: Mobasher, Anas: A novel NeRF-based approach for extracting single objects and generating synthetic training data for grasp prediction models.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 112 (2023).
62 pages, English.
Abstract

One of the main challenges in robotic manipulation is to grasp previously unseen objects without prior knowledge. State-of-the-art methods rely on dedicated machine learning models which are trained on RGB-Depth (RGB-D) images and annotated labels to predict grasp poses in unstructured environments and for a wide range of previously unseen objects. Collecting a diverse and labeled dataset, however, can be time-consuming and costly. To overcome these challenges, we propose to use Neural Radiance Fields (NeRF) to generate RGB-D images and to combine these with a cutting-edge automatic-labeling approach to create data for training grasp prediction networks. The main contribution of this thesis is a novel method for obtaining individual NeRFs for objects of interest and backgrounds. The method requires two input scenes: a complete scene containing an object of interest and the same scene but without the object. The steps of the method include training a NeRF on the background scene, aligning it with the object scene, combining it with another NeRF to be trained on the object scene, and joint optimization of both NeRFs with depth regularization loss added to NeRF loss. By applying this approach to various datasets, it is possible to create a library of trained object and background NeRFs. Arbitrary combinations of these NeRFs can then be used to generate novel scenes and render synthetic images for training detection networks. In a comprehensive ablation study, we employ our approach to create four distinct datasets, apply an automatic labeling pipeline to them and use them to train corresponding grasp prediction networks. The results validate the viability of NeRF-generated data for training detection models, showcasing a performance nearly on par with real data. Furthermore, our approach unveils exciting potential for scalability by facilitating the generation of novel data. 
Overall, this research advances the field of robotic manipulation by proving the potential of using NeRF-generated synthetic data and novel scenes to train robust grasp prediction models for real-world applications.
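
The joint optimization step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the two fields are merged by summing densities and blending colors by density share, the depth regularization is a squared error against a measured depth value, and the names `composite_ray`, `joint_loss`, and `depth_weight` are hypothetical.

```python
import math


def composite_ray(sigma_obj, rgb_obj, sigma_bg, rgb_bg, t_vals):
    """Volume-render one ray through two fields sampled at the same depths.

    Assumption: densities are summed and colors are blended by density
    share. The thesis may combine the object and background NeRFs differently.
    Requires at least two samples along the ray.
    """
    color = [0.0, 0.0, 0.0]
    depth = 0.0
    transmittance = 1.0
    for i in range(len(t_vals)):
        # Interval length to the next sample; the last sample reuses the
        # previous spacing.
        if i + 1 < len(t_vals):
            delta = t_vals[i + 1] - t_vals[i]
        else:
            delta = t_vals[i] - t_vals[i - 1]

        sigma = sigma_obj[i] + sigma_bg[i]
        if sigma > 0.0:
            w_obj = sigma_obj[i] / sigma
            rgb = [w_obj * co + (1.0 - w_obj) * cb
                   for co, cb in zip(rgb_obj[i], rgb_bg[i])]
        else:
            rgb = [0.0, 0.0, 0.0]

        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        color = [c + weight * r for c, r in zip(color, rgb)]
        depth += weight * t_vals[i]  # expected ray termination depth
        transmittance *= 1.0 - alpha
    return color, depth


def joint_loss(pred_rgb, pred_depth, target_rgb, target_depth, depth_weight=0.1):
    """Photometric loss plus a weighted depth term (hypothetical form)."""
    photometric = sum((p - t) ** 2 for p, t in zip(pred_rgb, target_rgb)) / 3.0
    depth_term = (pred_depth - target_depth) ** 2
    return photometric + depth_weight * depth_term
```

In an actual training setup, both fields would be neural networks and the loss would be minimized over many rays with an automatic differentiation framework. The sketch only shows how a single composited ray yields a color and a depth that both enter the objective.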

Department(s): University of Stuttgart, Institute of Visualisation and Interactive Systems, Visualisation and Interactive Systems
Supervisor(s): Weiskopf, Prof. Daniel; Gabriel, Dr. Miroslav; Schulz, Dr. Christoph; Künzel, Sebastian
Entry date: May 21, 2024