Master's Thesis MSTR-2018-120

Bibliographic Data
Balla, Irdi: Visual question answering for intuitive human-robot collaborations using compositional neural networks.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Master's Thesis No. 120 (2018).
45 pages, in English.
Abstract

Visual and language cues are very useful for understanding the environment around us. For AI systems it is likewise essential to be able to use text and images to reason about the situations they are in. Building conversational AI that shares a common linguistic and perceptual understanding with us is both highly desirable and challenging. Recent advances in deep learning and computer vision provide a host of effective neural machines capable of solving visual question-answering tasks end to end. In this thesis we aim to explore, extend, and adapt these methods to the domain of intuitive human-robot collaborations. Such domains require robots to collaborate smoothly with humans and assist them in vision-based manipulation tasks through a natural language interface; they are thus inherently symbolic, relational, and compositional. We therefore plan to investigate methods capable of (i) interpreting and grounding natural language queries, in particular referential expressions, in visual descriptions, and (ii) composing neural machines to answer compositional questions and understand commands in the domain of interest. To evaluate this work, we will use the Total Difficulty Test.
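To illustrate the idea of composing neural machines, the following is a minimal sketch of a modular visual-question-answering setup. PyTorch, the module names (Find, Answer), the tensor shapes, and the hand-written layout for a query such as "Where is the red cube?" are all illustrative assumptions, not the thesis's actual architecture.

# Minimal sketch of a compositional (neural-module-style) VQA setup.
# Assumes PyTorch; module names and shapes are illustrative only.
import torch
import torch.nn as nn

class Find(nn.Module):
    """Produces an attention map over image regions for a query word embedding."""
    def __init__(self, feat_dim, emb_dim):
        super().__init__()
        self.proj = nn.Linear(emb_dim, feat_dim)

    def forward(self, image_feats, word_emb):
        # image_feats: (num_regions, feat_dim), word_emb: (emb_dim,)
        scores = image_feats @ self.proj(word_emb)   # (num_regions,)
        return torch.softmax(scores, dim=0)          # attention over regions

class Answer(nn.Module):
    """Maps attention-weighted image features to a distribution over answers."""
    def __init__(self, feat_dim, num_answers):
        super().__init__()
        self.cls = nn.Linear(feat_dim, num_answers)

    def forward(self, image_feats, attention):
        attended = attention @ image_feats           # (feat_dim,)
        return self.cls(attended)                    # answer logits

# Compose the modules according to a hand-written layout for a referential
# query such as "Where is the red cube?": Find("red cube") -> Answer.
feat_dim, emb_dim, num_answers = 64, 32, 10
image_feats = torch.randn(14 * 14, feat_dim)         # stand-in CNN region features
word_emb = torch.randn(emb_dim)                      # stand-in embedding of "red cube"

find, answer = Find(feat_dim, emb_dim), Answer(feat_dim, num_answers)
logits = answer(image_feats, find(image_feats, word_emb))
print(logits.argmax().item())                        # index of the predicted answer

In practice such layouts would be predicted from the parsed question rather than written by hand; the sketch only shows how independent modules can be chained to answer a compositional query.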

Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Maschinelles Lernen und Robotik
Supervisors: Hennes, Daniel (Ph.D.); Ngo, Hung
Entry date: February 15, 2022