Master Thesis MSTR-2020-92

Bibliography: Nguyen, Son Tung: Representation learning of scene images for task and motion planning.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 92 (2020).
61 pages, English.
Abstract

This thesis investigates two methods for learning a state representation from image observations alone for task and motion planning (TAMP) problems. Our first method uses a multimodal learning formulation to optimize an autoencoder jointly on regular image reconstruction and on a natural language processing (NLP) task. This yields a discrete, spatially meaningful latent representation that enables effective autonomous planning for sequential decision-making problems using only visual sensory data. We integrate our method into a full planning framework and verify its feasibility on the classic blocks world domain [26]. Our experiments show that using auxiliary linguistic data leads to better representations and thus improves planning capability. However, since the representation is not interpretable, learning an accurate action model is extremely challenging, which leaves the method still inapplicable to TAMP problems. To address the need for an explainable representation, we therefore present a self-supervised learning method to learn scene graphs that represent objects (“red box”) and their spatial relationships (“yellow cylinder on red box”). Such a scene graph representation provides spatial relations in the form of symbolic logical predicates and thus eliminates the need to predefine these symbolic rules. Finally, we unify the proposed representation with a non-linear optimization method for robot motion planning and verify its feasibility on the classic blocks world domain. Our proposed framework successfully finds the sequence of actions and enables the robot to execute feasible motion plans to realize the given tasks.
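
As an illustration of the idea that a scene graph can be read off as symbolic predicates, the following minimal Python sketch turns relation edges into blocks-world style facts. It is not the thesis implementation: the names `Relation`, `to_predicates`, and `clear_objects` are hypothetical, and the scene graph is written by hand here, whereas the thesis learns it from images.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    """One directed edge of a scene graph (hypothetical structure)."""
    predicate: str  # e.g. "on"
    subject: str    # e.g. "yellow_cylinder"
    target: str     # e.g. "red_box"


def to_predicates(relations: List[Relation]) -> List[str]:
    """Render each edge as a symbolic fact such as on(a, b)."""
    return [f"{r.predicate}({r.subject}, {r.target})" for r in relations]


def clear_objects(objects: List[str], relations: List[Relation]) -> List[str]:
    """Return the objects that have nothing stacked on top of them."""
    covered = {r.target for r in relations if r.predicate == "on"}
    return [o for o in objects if o not in covered]


# Hand-written scene graph for "yellow cylinder on red box" (assumed input).
objects = ["red_box", "yellow_cylinder"]
relations = [Relation("on", "yellow_cylinder", "red_box")]

facts = to_predicates(relations)
clear = clear_objects(objects, relations)
print(facts)
print(clear)
```

A symbolic planner could consume facts of this form directly as its state description, which is the sense in which the predicates need not be defined by hand.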

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Machine Learning and Robotics
Supervisor(s): Mainprice, Dr. Jim; Oğuz, Dr. Özgür; Toussaint, Prof. Marc
Entry date: November 24, 2021
   Publ. Computer Science