Masterarbeit MSTR-2025-24

Bibliograph.
Daten
Mannanal, Sanjana Scaria: Feasibility of Adversarial Attacks Against Large Language Models That Interact With External Systems.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Masterarbeit Nr. 24 (2025).
117 Seiten, englisch.
Kurzfassung

This thesis examines the susceptibility of Large Language Models (LLMs) to indirect prompt injection attacks. Our primary research objective is to investigate LLM suscepti- bility to indirect prompt injection attacks, when augmented with an external function. We do so by identifying trends in attack susceptibility across a diverse set of LLMs. While previous studies have explored LLM security and uncovered new indirect prompt injection attack vectors, there exists no research that explores an external function as a potential attack vector. Additionally, there remains a significant gap in understanding how susceptibility changes when prompts are presented in non-English languages—a gap that this thesis aims to address. We designed adversarial prompts in four languages—English, German, French, and Malayalam. We then injected them indirectly into various LLMs through an external function. Our analysis then evaluates susceptibility based on key factors, including model provider, model size, architecture and release date. Our experiments demonstrate that indirect prompt injection attacks were successful across multiple models, without requiring complex, multi-turn interactions. Additionally, we found that attack success rates varied depending on the language of the prompt. However, when analysing patterns of susceptibility across models, we observed no clear trends, highlighting the need for testing each model separately for vulnerabilities.

Abteilung(en)Universität Stuttgart, Institut für Künstliche Intelligent, Intelligente Wahrnehmung
BetreuerRoitberg, Jun.-Prof. Alina; Vu, Prof. Ngoc Thang; Kleber, Dr. Stephan; Eppler, Jeremias
Eingabedatum13. August 2025
   Publ. Informatik