| Abstract | As AI planning systems grow more complex and are applied to a wider range of real-world problems, the efficiency and quality of their underlying domain models become increasingly critical. However, optimizing the structure of planning domain models remains a significant challenge, as few standard procedures exist. This thesis investigates whether Large Language Models (LLMs) can automatically reorder domain files to measurably improve AI planner performance. A modular framework was designed and implemented that combines multiple state-of-the-art LLMs with automated validation comprising syntactic and semantic checks. The system generates domain configurations using diverse prompt styles and temperature settings, then evaluates their impact on planner efficiency across benchmark planning domains and a diverse set of planners. The results demonstrate that LLM selection has a substantially greater effect on output quality than prompt strategy or temperature settings, with models such as GPT-4o achieving the highest rates of valid, semantically accurate, and performance-enhancing rewrites. However, not all automatically generated variants surpass the baselines, and the risk of unintended semantic changes persists. The outcome also depends on domain and planner characteristics, indicating that further research is needed to enable planner-specific adaptation, safeguard semantic correctness, and extend the approach to richer planning paradigms. Despite these limitations, the findings suggest that automated, LLM-driven domain rewriting can become a valuable preprocessing step in the planning pipeline. By establishing a transparent, extensible evaluation methodology, this research provides a blueprint for reproducible studies in LLM-driven domain optimization. It also critically assesses current model limitations and outlines clear directions for future research.
|