| Bibliography | Förster, Philipp: Inside the Black Box: Comparing Self-Play and Zero-Shot Coordination Agents in Hanabi. University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 94 (2025). 65 pages, English. |
| Abstract | Reinforcement learning agents trained in cooperative game settings often achieve high self-play performance but struggle to coordinate with unfamiliar partners. This limitation becomes especially visible in the cooperative card game Hanabi, commonly used as a benchmark that requires communication under hidden information and implicit reasoning about others. While recent training methods such as Other-Play (OP) and Off-Belief Learning (OBL) improve over self-play (SP) in generalization to zero-shot coordination, it remains unclear when and how these algorithms succeed or fail: performance metrics alone do not reveal which internal mechanisms drive successful coordination. This thesis investigates the internal reasoning processes of Hanabi agents by combining input-level and representation-level explainability. First, we implement Layer-wise Relevance Propagation (LRP) for recurrent architectures to analyze which input features most influence an agent’s decisions. Second, we apply linear probing to examine what information, such as card properties or partner intentions, is encoded in hidden states. Finally, we connect these internal explanations to reveal observable conventions and coordination patterns. Our experiments show that SP agents strongly rely on learned static partner traits and develop brittle turn-level conventions instead of attending and adapting to partners dynamically, which explains their poor cross-play performance. In contrast, zero-shot agents like OBL and OP rely less on partner traits and more on grounded facts about the board state, resulting in more robust behavior. Probing further reveals that different training methods encode game information in systematically different ways, with auxiliary losses improving representation stability across random seeds. Overall, this work demonstrates that analyzing internal representations provides insights that are not visible from score alone. Understanding how coordination emerges, and why it breaks down, can guide the development of cooperative agents that generalize more reliably to new partners and real-world multi-agent scenarios. |
| Department(s) | University of Stuttgart, Institute of Visualisation and Interactive Systems, Visualisation and Interactive Systems |
| Supervisor(s) | Bulling, Prof. Andreas; Kögel, Fabian; Ruhdorfer, Constantin |
| Entry date | March 16, 2026 |
|---|