|Angerbauer, Katrin: Exploring simplified subtitles to support spoken language understanding. |
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Master's Thesis No. 1 (2018).
165 pages, English.
Understanding spoken language is a crucial skill that we need throughout our lives. Yet it can be difficult for various reasons, especially for people who are hard of hearing or who are just learning a language. Captions or subtitles are a common means of making spoken information accessible. However, verbatim transcriptions of talks or lectures are often cumbersome to read, as we generally speak faster than we read. Subtitles are therefore often edited, either manually or automatically, to improve their readability. This thesis explores the automatic summarization of sentences and employs sentence compression by deletion with recurrent neural networks. We approach the task of sentence compression from two directions. On the one hand, we look at a technical solution to the problem. On the other hand, we take a human-centered perspective by investigating the effect of compressed subtitles on comprehension and cognitive load in a user study. The contribution is thus twofold: we present a neural network model for sentence compression and the results of a user study evaluating the concept of simplified subtitles. On the technical side, 60 different configurations of the model were tested. The best-scoring models achieved results comparable to state-of-the-art approaches. We use a sequence-to-sequence architecture together with a compression ratio parameter to control the resulting compression ratio. With this setup, the best-scoring model configuration achieved a compression ratio accuracy of 42.1 %, which can serve as a baseline for future experiments in that direction. Results from the 30 participants of the user study show that shortened subtitles could be enough to foster comprehension but result in higher cognitive load. Based on the participants' feedback, we gathered design suggestions to improve the usability of future implementations.
Overall, this thesis provides insights from both the technological side and the end-user perspective, contributing to easier access to spoken language.
|Department(s)||Universität Stuttgart, Institut für Maschinelle Sprachverarbeitung|
|Supervisor(s)||Vu, Jun.-Prof. Ngoc Thang; Schweitzer, Dr. Antje; Adel, Dr. Heike|
|Entry date||May 23, 2019|