Masterarbeit MSTR-2019-110

Bibliographic
data
Prakash, Rohit: Hierarchical inverse reinforcement learning from motion capture data.
Universität Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology, Master's Thesis No. 110 (2019).
50 pages, English.
Abstract

Human motion generally consists of multiple low-level tasks that are performed in a defined order, or in parallel, to achieve a high-level task. For example, making pizza dough involves several low-level tasks such as measuring water, adding yeast, and measuring flour, and these activities must be performed in a definite order for the dough to succeed. Imitating such a sequence of activities over a long horizon with a single global reward function is laborious; the process becomes easier with a hierarchical state representation of the task and locally learned rewards for each level of the hierarchy. In this thesis, we learn to imitate a common everyday human activity: setting a table for one person. This work adopts a framework called Hierarchical Inverse Reinforcement Learning (HIRL), a model that learns sub-task structure from demonstrations. With this framework, the activity is decomposed into multiple lower-level tasks that are performed in sequence using learned policies. Maximum Entropy Inverse Reinforcement Learning (MaxEnt-IRL) is used to learn local rewards for the sub-tasks. Together, the hierarchical state-space representation and the local reward functions encode the high-level task objective from human demonstrations of full-body motion performing the task. The model achieves an average success rate of 84% for the middle levels and 83% for the top level in cross-validation tests. For visualization, the model is simulated in a 2D representation that takes the current environment state as input and runs until the task is complete.
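The MaxEnt-IRL component mentioned above can be illustrated with a minimal sketch. The following is not the thesis's implementation but a generic rendering of the standard MaxEnt-IRL loop (Ziebart et al., 2008) on a small tabular MDP: a backward pass (soft value iteration) yields a stochastic policy under the current reward estimate, a forward pass computes expected state-visitation frequencies, and the reward weights are updated to match the expert's feature expectations. All names and the toy MDP are illustrative assumptions.

```python
import numpy as np

def maxent_irl(features, transitions, trajectories, gamma=0.9, lr=0.1, iters=100):
    """Minimal MaxEnt-IRL sketch (illustrative, not the thesis code).

    features:     (S, D) state feature matrix
    transitions:  (S, A) deterministic next-state indices
    trajectories: list of expert demos, each a list of state indices
    Returns learned reward weights theta of shape (D,).
    """
    S, D = features.shape
    A = transitions.shape[1]
    T = max(len(t) for t in trajectories)

    # Empirical feature expectations from the demonstrations
    f_expert = np.mean([features[traj].sum(axis=0) for traj in trajectories], axis=0)

    theta = np.zeros(D)
    for _ in range(iters):
        r = features @ theta

        # Backward pass: soft value iteration -> maximum-entropy policy
        V = np.full(S, -1e9)
        for _ in range(2 * T):
            Q = r[:, None] + gamma * V[transitions]   # (S, A)
            V = np.logaddexp.reduce(Q, axis=1)        # soft max over actions
        policy = np.exp(Q - V[:, None])               # rows sum to 1

        # Forward pass: expected state-visitation frequencies
        d = np.zeros(S)
        for traj in trajectories:
            d[traj[0]] += 1.0 / len(trajectories)     # empirical start distribution
        svf = d.copy()
        for _ in range(T - 1):
            d_next = np.zeros(S)
            for a in range(A):
                np.add.at(d_next, transitions[:, a], d * policy[:, a])
            d = d_next
            svf += d

        # Gradient ascent: match expert feature counts
        theta += lr * (f_expert - features.T @ svf)
    return theta
```

In the HIRL setting described above, a loop like this would be run once per sub-task on that sub-task's local state space, rather than once globally, which is what keeps the long-horizon problem tractable.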

Full text and
other links
Full text
Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Maschinelles Lernen und Robotik
Supervisors: Toussaint, Prof. Marc; Mainprice, Dr. Jim; Kratzer, Philipp
Submission date: March 21, 2022
Publ. Informatik